Nowadays, tiered architectures are widely accepted for constructing large scale information systems. In this 
context application servers often form the bottleneck for a system’s efficiency. An application server exposes an 
object oriented interface consisting of a set of methods which are accessed by potentially remote clients. The idea of 
method caching is to store results of read-only method invocations with respect to the application server’s interface 
on the client side. If the client invokes the same method with the same arguments again, the corresponding result 
can be taken from the cache without contacting the server. It has been shown that this approach can considerably 
improve a real world system’s efficiency. 

This paper extends the concept of method caching by addressing the case where clients wrap related method 
invocations in ACID transactions. Demarcating sequences of method calls in this way is supported by many 
important application server standards. In this context the paper presents an architecture, a theory and an efficient 
protocol for maintaining full transactional consistency and in particular serializability when using a method cache 
on the client side. In order to create a protocol for scheduling cached method results, the paper extends a classical 
transaction formalism. Based on this extension, a recovery protocol and an optimistic serializability protocol are 
derived. The latter one differs from traditional transactional cache protocols in many essential ways. An efficiency 
experiment validates the approach: Using the cache a system’s performance and scalability are considerably 
improved. 

Categories and Subject Descriptors: H.2.4.o [Information Systems]: Database Management - Systems, Transaction 
Processing; H.3.4.b [Information Systems]: Information Storage and Retrieval - System and Software, 
Distributed Systems; C.4 [Performance of Systems]: Optimization 

General Terms: Client-Server, Architecture, Transaction Management, Object Oriented 

Additional Key Words and Phrases: Caching, Application Server, Transaction Theory, Performance, Scalability 


1. INTRODUCTION 

Modern large-scale client-server-based information systems follow a tiered architecture. 
The most common solution is the three-tier architecture consisting of a presentation tier, 
an application tier and a data tier. E.g. for a typical web application, a servlet-enabled 
web server implements the presentation tier and a (relational) database system implements 
the data tier. Application server technologies such as EJB [Sun a] or corresponding parts 
of the .NET Framework [Microsoft ] are often used to realize the application tier. They 
offer an object oriented interface consisting of a set of service methods to their clients, the 
so called service interface. In order to centralize business logic but also for better system 
scalability, the different tiers are usually hosted on separate machines in a local network. 
This makes invoking a service method a costly affair, since it requires a remote method call 
which passes the application server’s infrastructure and often incurs database accesses. 

Consequently, application servers tend to become the bottleneck of an information system 
with respect to its performance and scalability. Many solutions have been proposed to 
tackle this problem including dynamic web caching [Anton et al. 2002; Challenger et al. 
1999; Li et al. 2002], method caching [Pfeifer and Jakschitsch 2003], application data 
caching [jcache ], database caching [Grembowicz 2000; Luo et al. 2002; The TimesTen 
Team 2000] and special design patterns [Marinescu 2002]. 

We concentrate on method caching whereby results of service method calls are cached 
on the client side of an application server. E.g. in case of a tiered web application, an application 
server’s client is usually a servlet-enabled web server. Alternatively, an application 
server’s client could also be an end-user program with a rich graphical user interface. 

If the client code invokes a service method that does not have any side effects, its result 
may be cached for later reuse on the client side. If the client code calls the same method 
with the same arguments again, the result can be read from the cache without contacting 
the application server. 

[Pfeifer and Jakschitsch 2003] showed that this approach can be pursued transparently, 
so that usually neither the client-side nor the server-side application code has to be aware 
of a related cache’s presence. Moreover, it validated that a method cache can considerably 
improve performance and scalability of real world applications. 

For caching approaches the most challenging part is usually to guarantee cache consistency. 
[Pfeifer and Jakschitsch 2003] also demonstrates how strong cache consistency 
can be achieved at the price of additional effort on the part of an application developer who 
has to describe certain interdependencies between methods. However, strong cache consistency 
does not cover the case where service method calls are demarcated by client-side 
transactions. 

Consequently, this paper extends the idea of method caching by addressing the case 
where the client code wraps service method invocations in ACID transactions. This type 
of transaction is explicitly supported by popular application server technologies such as 
EJB and .NET. This paper presents an architecture and a theory that enable transactional 
caching of method results on the client side while maintaining complete transactional consistency 
and in particular serializability. Moreover, we discuss how to preserve important 
recovery properties when using a transactional method cache. 

In this context many important assumptions differ from the ones that govern conventional 
transactional cache protocols such as those presented in [Franklin et al. 1997]. In particular, 
we do not assume that a protocol for transactional method caching can be tightly 
integrated into the database system that underlies the application server. In practice, such 
an expectation would be unrealistic because commercial database systems do not allow a 
deep engagement in their internal transaction manager. Instead we propose an independent 
component, called the m-scheduler, for scheduling cached method results while asserting 
full transactional consistency. The m-scheduler is located in between the application server 
and the underlying database system, cooperates with a transactional method cache on the 
client side and makes conservative assumptions about the database system’s transaction 
management. 

The remainder of this paper is organized as follows: First we clarify the scope to which 
transactional method caching may be applied and explain how an application server architecture 
should be extended to enable this caching approach (Section 2). In order to build 
an m-scheduler and a related cache protocol, it is useful to extend the conventional notion 
of transactions. Section 3 develops a theory for transactional method caching on the basis 
of the classical 1-version and multiversion transaction formalisms. Using this theory, 
Section 5 develops a serializability protocol for scheduling cache hits for cached method 
results inside transactions. The protocol is optimistic but differs from existing transactional 
cache protocols such as OCC [Adya et al. 1995] in many essential ways. Before that, Section 
4 discusses how conventional recovery qualities can be assured in the presence of a transactional 
method cache. To demonstrate that the approach pays off, the paper presents an 
efficiency experiment for an EJB-based application server system (Section 6). Section 7 
outlines the relationships between our contribution and existing caching approaches for 
web applications as well as existing transaction protocols. We conclude with a summary 
and an outlook on future work. 

Fig. 1. Architecture of an Application Server Supporting Client-Side Transactions 


2. GENERAL ARCHITECTURE 

2.1 Client-Side Transactions for Application Servers 

This section highlights the general concept of client-side application transactions and the 
respective infrastructure. Figure 1 illustrates an architecture enabling client-side transactions 
in conjunction with service interfaces: An application server offers two interfaces, 
the service interface and the transaction interface. Both interfaces can be used via remote 
method calls from a client. E.g. for EJB, the service interface technically consists of a set of 
EJB Home and EJB Remote Interfaces (which are Java interfaces) while the transaction 
interface adheres to the Java Transaction API [Sun c]. 1 Using these interfaces, a client can 
wrap a sequence of service method invocations in an ACID transaction. The application 
server executes the client’s service method invocations and relies on one or more transactional 
resources (e.g. databases) to enable transactional consistency. To achieve this, 
the application server state (as far as relevant to clients) is derived from the state of the 
transactional resources. If a transactional resource is a relational database, this is typically 
realized by SQL statements inside service method implementations or by object relational 
mappings between application server objects and database table rows. As shown in Figure 
1, a service method implementation may therefore read and write data elements via the data 
access interface of the underlying database system. 

1 Note that the term "service interface" abstracts from the actual number of programming language interfaces for 
an application server standard. 

    import javax.naming.Context;
    import javax.naming.InitialContext;
    import javax.transaction.UserTransaction;

    // itemSession is a remote EJB reference whose lookup is not shown here.
    Context ctx = new InitialContext();

    // Request an application transaction
    UserTransaction utx = (UserTransaction) ctx.lookup("java:comp/UserTransaction");
    utx.begin();                              // Begin the transaction

    // Invoke service methods as part of the current transaction
    Item item = itemSession.findItemById(20);
    if (item.price > 42) {
        item.price = 42;
        itemSession.updateItem(item);
    }

    utx.commit();                             // Commit the transaction

Fig. 2. Example Code of a Client-Side Transaction Using EJB 

For every transaction that a client begins, the application server starts a transaction on 
every registered resource manager (e.g. a database transaction) and keeps it open for as 
long as the client transaction is open. All service method invocations inside a client’s 
transaction are tied to a respective resource transaction for every participating resource 
manager. To realize this, the resource managers are usually expected to provide a transaction 
demarcation interface according to the XA standard [The Open Group ]. When 
committing a client transaction, the application server acts as a transaction monitor and 
commits all respective resource transactions using a two-phase commit protocol. Due to 
this mechanism, transactional qualities of resource transactions are more or less inherited 
by client-side transactions. E.g., if there is only one participating resource manager and it 
guarantees serializability then the client transaction will also be serializable. 

Note that typically, application servers do not guarantee global serializability across 
multiple resource managers but only ascertain local serializability and atomic commits. 
The approach of this paper does not try to change this fact but offers the same degree of 
consistency in the presence of client-side method caches. Therefore, the actual number of 
resource managers is mostly irrelevant to this contribution (given that there is at least one 
such entity). 

Figure 2 presents an example of EJB-related code for a client-side transaction including 
service method calls. 

2.2 Integrating a Transactional Method Cache 

This section explains how a transactional method cache can be integrated in the above architecture. 
It shows how a service method invocation is generally processed in the presence 
of a method cache and describes a base protocol for keeping the cache contents up-to-date. 

2.2.1 Base Architecture. Figure 3 extends Figure 1 by the components additionally 
needed for transactional method caching. As described in Section 1 the cache is located at 
the client and implements the application server’s transaction interface as well as its service 
interface. 2 


2 Technically, this can be realized by applying the design pattern "dynamic proxy" ([Sun ]) or by generating 
the respective classes statically [Pfeifer and Jakschitsch 2003]. 

Fig. 3. Architecture of an Application Including a Transactional Method Cache and an m-Scheduler 


For the client code, the presence of the cache is completely transparent - it performs 
its method calls as usual. However, service method invocations and calls to demarcate 
transactions are now intercepted by the transactional method cache. For every service 
method call the cache checks if a related method result is in its store. If so, it returns 
the result to the client right away. Otherwise it delegates the call to the server where it is 
(almost) executed as usual. The cache always forwards calls for demarcating client-side 
transactions to the server. 
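
The interception step described above can be illustrated with Java's dynamic proxy mechanism, which footnote 2 names as one way to realize the cache. The following is only a minimal sketch under simplifying assumptions: the interface `ItemService`, the class `RemoteItemService` and the handler `CachingHandler` are hypothetical names introduced for illustration, every method is assumed to be read-only, and neither transactions nor invalidation are modelled.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MethodCacheProxyDemo {

    /** Hypothetical service interface standing in for an application server's service interface. */
    public interface ItemService {
        String findItemName(int id);
    }

    /** Stand-in for the remote server; counts how often it is actually contacted. */
    public static class RemoteItemService implements ItemService {
        public int calls = 0;

        @Override
        public String findItemName(int id) {
            calls++;
            return "item-" + id;
        }
    }

    /** Intercepts service calls and answers repeated calls from a local store. */
    public static class CachingHandler implements InvocationHandler {
        private final Object target;
        // Cache key: (method, arguments); the this-object is implied by the wrapped target.
        private final Map<List<Object>, Object> store = new HashMap<>();

        public CachingHandler(Object target) {
            this.target = target;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            List<Object> key =
                    Arrays.asList(method, Arrays.asList(args == null ? new Object[0] : args));
            if (store.containsKey(key)) {
                return store.get(key);                   // cache hit: no server contact
            }
            Object result = method.invoke(target, args); // cache miss: delegate the call
            store.put(key, result);
            return result;
        }
    }

    /** Wraps a service object so that client code keeps using the plain interface. */
    public static ItemService cached(ItemService target) {
        return (ItemService) Proxy.newProxyInstance(
                ItemService.class.getClassLoader(),
                new Class<?>[] { ItemService.class },
                new CachingHandler(target));
    }

    public static void main(String[] args) {
        RemoteItemService remote = new RemoteItemService();
        ItemService service = cached(remote);
        service.findItemName(20);
        service.findItemName(20);
        System.out.println("server calls: " + remote.calls); // prints "server calls: 1"
    }
}
```

The client code only sees `ItemService`, which mirrors the transparency property claimed for the cache; a realistic cache would additionally have to distinguish read-only from writing methods.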

In order to exchange additional cache consistency information, all remote method invocations 
might transfer extra data. This is indicated in Figure 3 by a plus sign added to a 
respective label. (Most modern remote method invocation protocols allow for this kind 
of extension.) When a method call arrives at the server, the additional information from 
the method cache is passed on to the m-scheduler. As soon as the call’s result is about to 
be returned to the client, the m-scheduler attaches consistency information which will be 
processed by the cache. 

The approach leaves the conventional message flow between client and server intact, 
since additional data is always piggy-backed to ordinary remote method calls. Only in case 
of cache hits, the information flow changes since client server communication is avoided. 
This lazy way of exchanging cache consistency information keeps the communication cost 
at a minimum but requires transactional method cache protocols that are optimistic. 

2.2.2 Base Protocol. The following paragraphs describe the base protocol for transactional 
method caching. Note that this protocol does not yet guarantee serializability. 
It merely ensures that cache content is created for read-only method invocations and that 
stale cached method results will be invalidated soon after a respective write operation. In 
later sections we will see how the base protocol can be extended to ascertain transactional 
consistency. 

Also note that the base protocol as described next refers to just one client cache whereby 
the corresponding client might run several concurrent transactions. However, the protocol 
can easily be extended to function with multiple clients. (The details are omitted in 
favour of a compact presentation.) 

Let $m$ be a service method and $o.m(a)$ be a corresponding method invocation comprising 
the this-object $o$ and the argument list $a$. When $o.m(a)$ reaches the cache, the cache checks if the 
result for the cache key $(m, o, a)$ is in its store. For a cache hit, the result is returned to 
the client code straight away. Further, for every active local client transaction $T_i$ the cache 
keeps an initially empty list $L_i$ and enters into it all method results that were returned from 
the cache on behalf of transaction $T_i$. In order to do so, every cached method result is 
assigned a unique identifier which is entered in $L_i$. 

If a cache miss occurs in $T_i$ or if the client tries to commit $T_i$, the respective method call is 
delegated to the application server. The method cache attaches the list $L_i$ of the respective 
transaction $T_i$ to the call and sends it to the server. On the server side, the call is executed 
as usual; however, $L_i$ is forwarded to a new component, the so-called m-scheduler. The 
m-scheduler is in charge of scheduling the use of cached method results in such a way that 
a client transaction $T_i$ remains consistent, i.e., serializable. It can do so because it knows 
all cache hits of $T_i$ from the respective list $L_i$, and it also observes all data access operations 
that service method implementations perform via resource managers. 

Take a cache miss so that the method call $o.m(a)$ from above might cause several read 
and write operations on a relational database. The m-scheduler observes these operations, 
keeps track of them in an operation list $l_i$ and passes the operations on to the database 
system. For now we assume that $l_i$ consists of operations of the type $r[x]$ and $w[x]$ with $x$ 
representing a data element of the database. However, as will be discussed later, there 
are challenges in identifying data elements such as $x$. 

When the execution of $o.m(a)$ finishes at the server, the m-scheduler checks if there 
are any write operations in the operation list $l_i$. If not, the respective method invocation 
left the database state unchanged and will become a candidate for caching. In this case 
the m-scheduler associates a globally unique identifier $(i,k)$ with $o.m(a)$ where $i$ represents 
the transaction $T_i$ in which $o.m(a)$ was computed and $k$ identifies $o.m(a)$ inside $T_i$. 
Moreover, the m-scheduler maintains a global table $V$ to associate all identifiers of cached 
method calls (from all transactions) with all data elements that were read during a respective 
method execution. So, for $o.m(a)$ it will enter $(i,k)$ and the respective data elements 
(as known from $l_i$) in $V$. When the application server sends the result $r$ from $o.m(a)$’s 
execution to the client, the respective message also contains the tuple $(i,k)$. This tells the 
cache that $r$ should be cached and it saves both $r$ and $(i,k)$ together with the cache key 
$(m, o, a)$ in its store. 

If, on the other hand, $o.m(a)$ did cause one or more write operations, the system behaves 
differently: Let $x$ be a data element which was written on behalf of $o.m(a)$. Using $V$ the 
m-scheduler determines all identifiers of cached method results during whose computation $x$ was 
read and collects them in an invalidation list $h$. The server attaches $h$ to the result message 
which contains $r$ and sends it to the client. When the client receives the message it removes 
all method results from the cache which are identified by elements in $h$. Eventually it 
returns $r$ to the client code. 
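
The client-side bookkeeping of the base protocol can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the class and method names (`ClientCacheSketch`, `lookup`, `drainHits`, `onResponse`) are hypothetical, the cache key $(m, o, a)$ is flattened to a string, and the sketch assumes that the hit list of a transaction is reset once it has been handed over for piggy-backing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ClientCacheSketch {

    /** Identifier (i, k) of a cached method result, assigned by the m-scheduler. */
    public static final class MId {
        public final int i, k;

        public MId(int i, int k) { this.i = i; this.k = k; }

        @Override public boolean equals(Object o) {
            return o instanceof MId && ((MId) o).i == i && ((MId) o).k == k;
        }

        @Override public int hashCode() { return Objects.hash(i, k); }
    }

    private static final class Entry {
        final Object result;
        final MId id;

        Entry(Object result, MId id) { this.result = result; this.id = id; }
    }

    // Store: cache key (m, o, a), flattened to a string here, mapped to result and ID.
    private final Map<String, Entry> store = new HashMap<>();
    // Hit lists: for each active transaction, the IDs of results served from the cache.
    private final Map<Integer, List<MId>> hits = new HashMap<>();

    /** Cache lookup inside transaction txId; returns null on a cache miss. */
    public Object lookup(int txId, String key) {
        Entry e = store.get(key);
        if (e == null) return null;
        hits.computeIfAbsent(txId, t -> new ArrayList<>()).add(e.id);
        return e.result;
    }

    /** Called on a miss or a commit: returns the hit list for piggy-backing and resets it. */
    public List<MId> drainHits(int txId) {
        List<MId> l = hits.remove(txId);
        return l == null ? new ArrayList<MId>() : l;
    }

    /** Processes a server response: drop entries listed in h, then store r if it received an ID. */
    public void onResponse(String key, Object r, MId id, List<MId> h) {
        store.values().removeIf(e -> h.contains(e.id));
        if (id != null) store.put(key, new Entry(r, id));
    }

    public static void main(String[] args) {
        ClientCacheSketch cache = new ClientCacheSketch();
        MId id = new MId(1, 0);
        cache.onResponse("findItemById(20)", "item-20", id, new ArrayList<MId>());
        System.out.println(cache.lookup(3, "findItemById(20)")); // served from the cache
        System.out.println(cache.drainHits(3).size());           // one recorded hit

        List<MId> h = new ArrayList<>();
        h.add(id);
        cache.onResponse("updateItem(20)", null, null, h);       // a write invalidates (1, 0)
        System.out.println(cache.lookup(3, "findItemById(20)")); // miss after invalidation
    }
}
```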

To sum up, the m-scheduler needs identifiers for cached method calls like $(i,k)$, lists like 
$L_i$, $l_i$ and $h$ as well as the table $V$ to enable consistent transaction executions and to keep the 
cache up-to-date. Using $L_i$ the m-scheduler gets to know what cached method calls were 
accessed in a transaction. Using $V$ the m-scheduler can tell what data elements were read 
to produce cached method results and also it can derive what cached method results must 
be invalidated. Using $(i,k)$ the m-scheduler can associate cache hits with entries from $V$. 

    interface DE {}                    // Representation of a data element (just a marker interface)
    class MId { int k, l; }            // ID of a cached method result
    class Op { boolean read; DE x; }   // Representation of a database operation r[x] or w[x]

    class T {                          // Representation of a transaction T_i
        int id;                        // Transaction ID
        List<Op> l = ∅;                // Database operations for the current method execution (l_i)
        int nextMId = 0;               // Counter for new IDs of cached method results
    }

    class Req {                        // A service method call which is forwarded to the server
        int txId;                      // ID of the transaction containing the call
        Object o; Method m; Object[] args;  // Method call details
        List<MId> L;                   // Recent client-side cache hits for the given transaction (L_i)
    }

    class Res {                        // Response for a service method execution
        Object r;                      // The execution's result
        boolean cachable;              // Whether the result is cachable or not
        MId m = null;                  // If result is cachable: the ID of the result
        List<MId> h;                   // IDs of recently invalidated cached method results
    }

    class MScheduler {                 // Representation of the m-scheduler
        Rel<DE, MId> V = ∅;            // Relates x with IDs of cached method results
        Map<int, T> txId2T = ∅;        // Relates a transaction's ID with its transaction object

        // m-scheduler part for handling a request of a service method execution
        void handleRequest(Req req) {
            // Iterate over all recent cache hits of the considered transaction and schedule them
            for each m ∈ req.L
                methodOp(txId2T(req.txId), m);       // (for details see later)
        }

        // Complete the response of a service method execution
        void completeResponse(Res res, T t) {
            res.cachable = true;                     // At first, assume that the result is cachable
            for each op ∈ t.l
                if (!op.read) {                      // If the method executed a write operation, ...
                    res.cachable = false;            // ... it is not cachable
                    // Update h to invalidate the respective cache entries at the client
                    for each m ∈ V(op.x) res.h.add(m);
                }
            if (res.cachable) {                      // If the result will be cached, ...
                res.m = new MId(t.id, t.nextMId++);  // ... generate its ID and ...
                for each op ∈ t.l                    // ... register it at the server using V
                    V.put(op.x, res.m);
            }
            t.l.clear();   // Clear the database operations list for the next method execution
        }
    }

Fig. 4. Java Pseudo Code for the Base Protocol’s Aspects at the m-Scheduler 
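
For readers who want to experiment with the server-side part of the base protocol, the following is one possible plain-Java rendering of `completeResponse()` from Figure 4. It is a sketch under assumptions, not the authors' code: data elements and method-result IDs are represented as strings, the relation $V$ is modelled as a map from a data element to a set of IDs, and request handling (`handleRequest()` and `methodOp()`) is omitted because its details are only given later in the paper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MSchedulerSketch {

    /** Renders the ID (k, l) of a cached method result as a string (simplification). */
    public static String mid(int k, int l) {
        return "(" + k + "," + l + ")";
    }

    /** A database operation: r[x] if read is true, otherwise w[x]. */
    public static final class Op {
        public final boolean read;
        public final String x;

        public Op(boolean read, String x) { this.read = read; this.x = x; }
    }

    /** A transaction with the operation list of the method currently being executed. */
    public static final class Tx {
        public final int id;
        public final List<Op> l = new ArrayList<>();
        public int nextMId = 0;

        public Tx(int id) { this.id = id; }
    }

    /** The protocol-related part of a response. */
    public static final class Res {
        public boolean cachable;
        public String m;                                   // ID of the result, if cachable
        public final List<String> h = new ArrayList<>();   // invalidation list
    }

    // V: data element -> IDs of cached method results whose computation read it.
    private final Map<String, Set<String>> v = new HashMap<>();

    public void completeResponse(Res res, Tx t) {
        res.cachable = true;
        for (Op op : t.l) {
            if (!op.read) {
                // A write makes the result non-cachable and invalidates dependent entries.
                res.cachable = false;
                res.h.addAll(v.getOrDefault(op.x, Collections.<String>emptySet()));
            }
        }
        if (res.cachable) {
            res.m = mid(t.id, t.nextMId++);
            for (Op op : t.l) {
                v.computeIfAbsent(op.x, k -> new HashSet<>()).add(res.m);
            }
        }
        t.l.clear(); // prepare for the next method execution
    }

    public static void main(String[] args) {
        MSchedulerSketch s = new MSchedulerSketch();

        Tx t1 = new Tx(1);                 // a method in T_1 reads y and x
        t1.l.add(new Op(true, "y"));
        t1.l.add(new Op(true, "x"));
        Res r1 = new Res();
        s.completeResponse(r1, t1);
        System.out.println(r1.cachable + " " + r1.m);   // prints "true (1,0)"

        Tx t2 = new Tx(2);                 // a method in T_2 writes x
        t2.l.add(new Op(false, "x"));
        Res r2 = new Res();
        s.completeResponse(r2, t2);
        System.out.println(r2.cachable + " " + r2.h);   // prints "false [(1,0)]"
    }
}
```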

Figure 4 illustrates the base protocol’s data structures and some of its important implementation 
aspects at the server side. 3 The classes Req and Res represent the requests and 
the responses of service method calls addressing the server. The classes’ field names match 
the names used in the protocol description from above. 

3 Note that in order to represent data types conveniently, the code applies parametric polymorphism (also known 
as "generics" in the Java world [Gilad Bracha ]). E.g., the polymorphic type Rel<A, B> stands for finite relations 
$R \subseteq A \times B$ and the type Map<A, B> represents finite functions $A \rightarrow B$. 

At the server side, the m-scheduler drives the base protocol with respect to handling requests 
and generating responses. In this context the application server is supposed to call 
MScheduler.handleRequest() when it receives a remote service method call. After the 
application server has computed the method call’s result r, it invokes completeResponse(). 
This way the m-scheduler can add all missing base protocol information to the response 
object. Eventually, the server sends the completed response object to the client. 

2.2.3 Integrated Transaction Scheduling. Although method caching happens on the 
client side, cache consistency is provided by the m-scheduler (on the server side). Without 
a transactional method cache in place, client transactions are mainly based on the transaction 
management of resource managers. For this purpose, every resource manager has 
its own unit for scheduling transaction operations, the so-called rw-scheduler. E.g. the 
rw-scheduler applies a serializability protocol such as two-phase locking [Bernstein et al. 
1987], 2-version two-phase locking or FOCC [Harder 1984]. Unfortunately, the use of 
cached method results is beyond an rw-scheduler’s control but still affects transactional 
consistency. Therefore, the m-scheduler and a respective rw-scheduler must cooperate in 
order to provide consistent client transactions. 

Since resource manager products such as relational database management systems 
(RDBMSs) cannot be easily prepared for such an integration, we propose a layered approach 
for scheduling transactions in the presence of a method cache. Using this approach the 
resource manager is completely unaware of an m-scheduler and performs its tasks as usual. 

The m-scheduler intercepts all transaction operations that address the resource manager 
and, on top of that, schedules the use of cached method results. In order to do so, it makes 
conservative assumptions about the rw-scheduler’s behavior and handles conflicts resulting 
from the use of cached method results and conventional write operations. Using the 
data structures from above it has all information at hand to perform this task. The next part 
of this paper is devoted to developing a theory for how an m-scheduler can produce serializable 
transactions under these conditions. The general idea of separating different parts 
of a transaction scheduling process along certain types of data operations can be found in 
[Bernstein et al. 1987]. We build on this idea for creating an integrated scheduler consisting 
of an m-scheduler and an rw-scheduler. 

Note that it is a crucial requirement for the m-scheduler to observe all transaction operations 
addressing the resource manager. Otherwise, it might miss potential conflicts between 
operations and therefore generate non-serializable histories. 

As mentioned above, there is an additional challenge when constructing an m-scheduler 
because it has to observe access operations with respect to single data elements from a 
database. E.g. if the m-scheduler should integrate with an RDBMS, database elements 
might be table rows. Since the m-scheduler acts outside of the RDBMS, it can only observe 
database access on the basis of SQL statements. Unfortunately, SQL statements specify 
data elements only descriptively and so the m-scheduler is unable to directly identify 
data elements as needed. As a rather pragmatic solution to this problem, we expect an application 
developer to help out by providing the necessary information via some extra code 
inside service method implementations. The extra code is inserted after a corresponding 
SQL statement and refers to the m-scheduler in order to tell it what data elements the SQL 
statement accessed. It is up to the application developer to find a useful representation for 
identifying data elements. From our experience, key values of table rows are mostly a good 
choice. 
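
The kind of extra code expected from the application developer might look as follows. The paper does not specify this interface, so everything here is an assumption made for illustration: `OperationLog` is a hypothetical hook into the m-scheduler, the table name `item` and the SQL texts are invented, and the SQL execution itself is elided. Data elements are identified by table name and row key, in line with the remark that key values of table rows are a good choice.

```java
import java.util.ArrayList;
import java.util.List;

public class DataElementHints {

    /** Hypothetical hook through which service code reports the data elements it accessed. */
    public static class OperationLog {
        public final List<String> ops = new ArrayList<>();

        public void read(String table, Object key) {
            ops.add("r[" + table + ":" + key + "]");
        }

        public void write(String table, Object key) {
            ops.add("w[" + table + ":" + key + "]");
        }
    }

    /** A service method implementation with the developer-provided hints. */
    public static void updatePrice(OperationLog log, int itemId, int newPrice) {
        // ... execute "SELECT price FROM item WHERE id = ?" via the data access interface ...
        log.read("item", itemId);   // tell the m-scheduler which row the SELECT touched

        // ... execute "UPDATE item SET price = ? WHERE id = ?" ...
        log.write("item", itemId);  // tell the m-scheduler which row the UPDATE touched
    }

    public static void main(String[] args) {
        OperationLog log = new OperationLog();
        updatePrice(log, 20, 42);
        System.out.println(log.ops); // prints [r[item:20], w[item:20]]
    }
}
```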

3. TRANSACTION THEORY FOR METHOD-BASED CACHING 
3.1 MC-Transactions and MC-Histories 

In order to produce serializable histories in conjunction with method caching, one has to 
represent the use of cached method results in transaction histories. This section extends the 
notion of conventional transactions and 1-version histories such as presented in [Bernstein 
et al. 1987] by introducing a new operation that indicates the use of a cached method result 
inside a transaction. As opposed to conventional read and write operations we call such an 
operation a method operation. 

A benefit of method operations is that they accurately and naturally represent the use 
of cached method results in a transaction formalism. More importantly, they enable the 
development and the verification of non-trivial serialization protocols for m-schedulers. 
One such protocol will be described in Section 5. 

For an intuitive understanding of method operations we take a look at a corresponding 
history before we come up with a proper definition for it. Consider the following history: 

$$H_1 = r_1^4[y]\, r_1^4[x]\, c_1\, w_2[x]\, c_2\, m_3^{1,4}\, r_3^j[x]\, c_3.$$

How does it differ from a conventional 1-version history? First of all, we have read operations 
with superscripts such as $r_1^4[x]$. This operation is just like an ordinary 1-version read 
operation (e.g. like $r_1[x]$) except that the superscript 4 is an identifier for the method call 
on whose behalf the read operation was performed. The respective method call is executed 
on the server side and so it produces ordinary read operations at the resource manager. 
As the method call reads two data elements, there is a series of read operations with the 
same superscript, namely $r_1^4[y]$ and $r_1^4[x]$. Since the method call with the ID 4 in $T_1$ only 
reads data, its result may be cached on the client side. Afterwards it is available for cache 
hits (which might occur in other transactions). Note that from a technical point of view, 
the superscripts for read operations are created and used by the m-scheduler. They are not 
visible and not relevant to a resource manager’s rw-scheduler. 

Secondly, $H_1$ contains the method operation $m_3^{1,4}$. It reflects an access to a cached 
method result in transaction $T_3$. The index 3 specifies that $m_3^{1,4}$ belongs to $T_3$. Furthermore, 
the superscript of $m_3^{1,4}$ uniquely identifies the cached method result to which it refers: It is 
just the result that was produced by the operations $r_1^4[y]$ and $r_1^4[x]$ of $T_1$. So the number 1 in 
the superscript of $m_3^{1,4}$ refers to $T_1$ and the number 4 identifies the method call with the ID 4. 

We have just covered the most relevant aspects of MC-histories and how they extend 
conventional 1-version histories. The following definitions implement these ideas. 

DEFINITION 1. An MC-transaction $T_i$ is a set of operations with a partial ordering 
relation $<_i$ where 

- $T_i \subseteq \{\, w_i[x],\ r_i^j[x],\ m_i^{k,l} \mid x \text{ is a data element} \wedge j,k,l \in \mathbb{N} \setminus \{0\} \,\} \cup \{a_i, c_i\}$, 

- $a_i \in T_i \Leftrightarrow c_i \notin T_i$, 

- $\forall p \in T_i : p \notin \{a_i, c_i\} \Rightarrow (p <_i a_i \vee p <_i c_i)$, 

- $\forall r_i^j[x], w_i[x] \in T_i : r_i^j[x] <_i w_i[x] \Leftrightarrow \neg (w_i[x] <_i r_i^j[x])$. 

Besides introducing method operations, MC-transactions require every read operation to 
have a superscript. Note that a read operation’s superscript is only necessary to "reference 
it" from method operations as explained for the history $H_1$. 4 

DEFINITION 2. Let $\{T_1, \ldots, T_n\}$ be a set of MC-transactions. An MC-history $H$ is defined 
as $H = \bigcup_{i=1}^{n} T_i$ with a partial ordering relation $< \;\supseteq\; \bigcup_{i=1}^{n} <_i$. Furthermore, the following 
condition must hold: 

$$\forall m_j^{k,l} \in H : k \in \{1, \ldots, n\} \;\wedge\; \forall r_k^l[x] \in H : r_k^l[x] < m_j^{k,l}.$$

The last condition of Definition 2 ensures that every $m_j^{k,l}$ refers to a $T_k$ that exists in $H$. 
However, it is not necessary that there exist any read operations of the form $r_k^l[\ldots]$ in $H$. 

DEFINITION 3. The function $d(p)$ returns the set of data elements of an operation $p$ in 
an MC-history $H$ as follows: 

$$d(r_i^j[x]) = d(w_i[x]) = \{x\}, \qquad d(m_j^{k,l}) = \{\, x \mid \exists\, r_k^l[x] \in H \,\}.$$

Further, $a(p)$ shall be the type of an operation $p \in H$, so $a(r_i^j[x]) = r$, $a(w_i[x]) = w$ and 
$a(m_j^{k,l}) = m$. 

Two operations $u_i, v_j \in H$ conflict with each other, expressed by $u_i \nparallel v_j$, iff 

$$d(u_i) \cap d(v_j) \neq \emptyset \;\wedge\; \Big( \big(T_i \neq T_j \wedge (a(u_i) = w \vee a(v_j) = w)\big) \;\vee\; \big(a(u_i) = w \wedge a(v_j) = m\big) \;\vee\; \big(a(u_i) = m \wedge a(v_j) = w\big) \Big).$$

Obviously, the data elements that cause conflicts with respect to a method operation $m_j^{k,l}$ 
are just the ones which are read by operations of the form $r_k^l[\ldots]$. Consider the MC-history 
$H_1$ from above. It holds the following conflicts (and no others): 

$$r_1^4[x] \nparallel w_2[x], \qquad w_2[x] \nparallel r_3^j[x], \qquad m_3^{1,4} \nparallel w_2[x].$$

Definition 3 states that conflicts inside a single transaction $T_i$ are possible if one of the 
conflicting operations is a write operation and the other one is a method operation. To see 
why this is useful, consider the history 

$$H_2 = r_1^1[x]\, c_1\, w_2[x]\, m_2^{1,1}\, c_2.$$

Here, $w_2[x] \nparallel m_2^{1,1}$ is reasonable because $m_2^{1,1}$ refers to an $x$-value that was read before $w_2[x]$ 
is performed. 
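
The conflict relation of Definition 3 can be checked mechanically. The sketch below encodes $d(p)$ and the conflict predicate and applies them to the example histories; commit operations are left out because they have no data elements. The class and method names are hypothetical, and the superscript chosen for the read operation of $T_3$ is an arbitrary placeholder that does not influence the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class McConflicts {

    public enum Type { R, W, M }

    /** One operation of an MC-history (commit and abort operations are left out). */
    public static final class Op {
        public final Type type;
        public final int tx;     // index of the transaction the operation belongs to
        public final int refTx;  // for a method operation: the transaction it refers to
        public final int call;   // method call ID (superscript); unused for writes
        public final String x;   // data element; null for method operations

        private Op(Type type, int tx, int refTx, int call, String x) {
            this.type = type; this.tx = tx; this.refTx = refTx; this.call = call; this.x = x;
        }

        public static Op r(int i, int j, String x) { return new Op(Type.R, i, i, j, x); }
        public static Op w(int i, String x)        { return new Op(Type.W, i, i, 0, x); }
        public static Op m(int j, int k, int l)    { return new Op(Type.M, j, k, l, null); }
    }

    /** d(p): the data elements of p; for a method operation, those read by the referenced call. */
    public static Set<String> d(Op p, List<Op> h) {
        Set<String> s = new HashSet<>();
        if (p.type != Type.M) {
            s.add(p.x);
            return s;
        }
        for (Op q : h) {
            if (q.type == Type.R && q.tx == p.refTx && q.call == p.call) s.add(q.x);
        }
        return s;
    }

    /** The conflict predicate of Definition 3. */
    public static boolean conflict(Op u, Op v, List<Op> h) {
        Set<String> common = d(u, h);
        common.retainAll(d(v, h));
        if (common.isEmpty()) return false;
        boolean acrossTx = u.tx != v.tx && (u.type == Type.W || v.type == Type.W);
        boolean writeVsMethod = (u.type == Type.W && v.type == Type.M)
                || (u.type == Type.M && v.type == Type.W);
        return acrossTx || writeVsMethod;
    }

    /** Counts the unordered pairs of conflicting operations in h. */
    public static int countConflicts(List<Op> h) {
        int n = 0;
        for (int a = 0; a < h.size(); a++)
            for (int b = a + 1; b < h.size(); b++)
                if (conflict(h.get(a), h.get(b), h)) n++;
        return n;
    }

    public static void main(String[] args) {
        // H_1 without commits; the superscript 1 of T_3's read is an assumed placeholder.
        List<Op> h1 = new ArrayList<>(Arrays.asList(
                Op.r(1, 4, "y"), Op.r(1, 4, "x"), Op.w(2, "x"), Op.m(3, 1, 4), Op.r(3, 1, "x")));
        System.out.println(countConflicts(h1)); // prints 3

        // H_2 without commits: a conflict inside the single transaction T_2.
        List<Op> h2 = new ArrayList<>(Arrays.asList(
                Op.r(1, 1, "x"), Op.w(2, "x"), Op.m(2, 1, 1)));
        System.out.println(conflict(h2.get(1), h2.get(2), h2)); // prints true
    }
}
```

The count of three for $H_1$ matches the three conflicts listed in the text.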

As is common for conventional 1-version histories, we want to avoid MC-histories with 
unordered but conflicting operations. The next definition limits MC-histories in this respect. 

DEFINITION 4. An MC-history $H$ is well defined, iff 

$$\forall p, q \in H : p \nparallel q \Rightarrow p < q \vee q < p.$$


4 Technically, superscripts for read operations form an extension of conventional 1-version transactions because 
a respective transaction may contain several read operations of the same data element whereas this is not the 
case for a transaction such as defined in [Bernstein et al. 1987]. However, this detail has no major impact on 
transaction theory.

For the rest of this paper we are only interested in well defined MC-histories. So from now on, whenever we refer to the term "MC-history" we actually mean "well defined MC-history".

Definition 5. The rw-projection $RW$ maps an MC-history $H$ to a history $RW(H)$ with all operations from $H$ but its method operations, so $RW(H) = \{p \in H \mid a(p) \neq m\}$. Furthermore, it keeps all ordering relations from $H$, but those in which method operations are involved.

If $RW(H) = H$ holds for an MC-history $H$, it is called an rw-history. Similarly, if a transaction does not contain any method operations it is called an rw-transaction.

As an example of an rw-projection consider

$$RW(H_1) = r_1^4[y]\, r_1^4[x]\, c_1\, w_2[x]\, c_2\, r_3^5[x]\, c_3.$$

Apart from the superscript of read operations, rw-histories represent conventional 1-version histories. Later, rw-projections will help us to formalize how an m-scheduler and an rw-scheduler split their work for producing an integrated schedule. Note that the rw-scheduler only gets to see the rw-projection of an MC-history. This means that formal qualities that the rw-scheduler should assert may be associated with an rw-projection but not an entire MC-history.
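For a totally ordered history the rw-projection amounts to a simple filter. The sketch below assumes a hypothetical tuple encoding of operations (`("r", i, l, x)`, `("w", i, x)`, `("m", j, k, l)`, `("c", i)`); the function names are illustrative only.

```python
def rw_projection(history):
    """RW(H): drop method operations, keep the relative order of the rest."""
    return [op for op in history if op[0] != "m"]


def is_rw_history(history):
    """An MC-history is an rw-history iff RW(H) = H."""
    return rw_projection(history) == history


# H_1 of the running example, in the assumed encoding.
H1 = [("r", 1, 4, "y"), ("r", 1, 4, "x"), ("c", 1),
      ("w", 2, "x"), ("c", 2),
      ("m", 3, 1, 4), ("r", 3, 5, "x"), ("c", 3)]

RW_H1 = rw_projection(H1)
```

Because the list order stands in for the ordering relation, filtering automatically keeps every ordering between the remaining operations.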

3.2 Multiversion Histories 

This section briefly defines a slight adaption of conventional multiversion histories and 
multiversion serializability graphs. The adaption is necessary for a sound introduction of 
serializable MC-histories which follows in Section 3.3. 

DEFINITION 6. Let $\{T_1,\dots,T_n\}$ be a set of rw-transactions. A multiversion history $H$ is defined as $H = \{\, h(p) \mid p \in \bigcup_{i=1}^{n} T_i \,\}$ with a partial ordering relation $<$. Further, the function $h$ must fulfill the following criteria:

- $\forall a_i, c_i, w_i[x] \in \bigcup_{k=1}^{n} T_k : h(a_i) = a_i \wedge h(c_i) = c_i \wedge h(w_i[x]) = w_i[x_i]$,

- $\forall r_j^l[x] \in \bigcup_{k=1}^{n} T_k : \exists i \in \{1,\dots,n\} : h(r_j^l[x]) = r_j^l[x_i]$,

- $\forall i \in \{1,\dots,n\} : \forall p,q \in T_i : p <_i q \Rightarrow h(p) < h(q)$,

- $\forall r_j^l[x] \in \bigcup_{k=1}^{n} T_k : h(r_j^l[x]) = r_j^l[x_i] \Rightarrow (i = 0 \vee \exists\, w_i[x_i] \in H : w_i[x_i] < r_j^l[x_i])$,

- $\forall r_j^l[x] \in \bigcup_{k=1}^{n} T_k : (h(r_j^l[x]) = r_j^l[x_i] \wedge i \neq j \wedge c_j \in H) \Rightarrow c_i \in H$.

An $x_i$ is called a version of the data element $x$.

The above definition assumes that prior to any write operation, there already exists an initial version $x_0$ for every data element $x$.

Mainly for consistency reasons, multiversion histories maintain the superscripts of read operations as introduced by Definition 1. Apart from this, the multiversion histories defined here differ from the ones in [Bernstein et al. 1987] because $h$ is not expected to map transaction operations $r_i^l[x]$ with $w_i[x] <_i r_i^l[x]$ to $r_i^l[x_i]$. The criterion would be too restrictive for the definition of serializable MC-histories from Section 3.3. However, for serializable multiversion histories, we still accomplish a similar result as in [Bernstein et al. 1987] because the definition of multiversion serializability graphs from below accounts for this issue.

DEFINITION 7. Let $D$ be the set of data elements of all operations of a multiversion history $H$, so $D = \{x \mid \exists\, r_i^l[x_j] \in H \vee \exists\, w_i[x_i] \in H\}$. A version order $\ll$ establishes for every data element $x \in D$ a total order of its versions, such that $x_0$ is the smallest version:

$$\forall x \in D : \forall i,j \in \mathbb{N} \setminus \{0\} : x_0 \ll x_i \wedge (i \neq j \Rightarrow x_i \ll x_j \vee x_j \ll x_i).$$

A version order that adheres to the following predicate is called write version order:

$$\forall w_i[x_i], w_j[x_j] \in H : (w_i[x_i] < w_j[x_j] \vee i = 0) \Rightarrow x_i \ll x_j.$$

Write version orders are specific version orders. As we will see, it turns out that we have 
to rely on write version orders in order to create a serializability theory for MC-histories. 

To keep things short, we omit the definition of serializable (or more specifically 1-serializable) multiversion histories. Instead, we turn to the definition of multiversion serializability graphs straight away and assume that the reader is familiar with the underlying serializability theorem (see [Bernstein et al. 1987]).

DEFINITION 8. Let $H$ be a multiversion history for the rw-transactions $\{T_1,\dots,T_n\}$ and $\ll$ be a corresponding version order. The multiversion serializability graph $MVSG \subseteq \{T_1,\dots,T_n\}^2$ for $H$ and $\ll$ is given by the following predicate:

$$(T_i,T_j) \in MVSG :\Leftrightarrow c_i \in T_i \wedge c_j \in T_j \wedge \exists\, r_k^h[x_l], w_m[x_m] \in H :$$

$$(i = j = k = m \wedge i \neq l \wedge w_i[x_i] < r_k^h[x_l]) \vee (i \neq j \wedge m = i = l \wedge k = j) \vee$$

$$(i \neq j \wedge m = i \wedge l = j \wedge x_m \ll x_l) \vee (i \neq j \wedge k = i \wedge m = j \wedge x_l \ll x_m).$$

Instead of writing $(T_i,T_j) \in MVSG$ we simply write $T_i \to T_j$. If one of the last two disjunctive clauses holds, then $T_i \to T_j$ is called a version order edge.

Since Definition 6 enables multiversion histories with operations $w_i[x_i] < r_i^h[x_j]$ and $i \neq j$, the first disjunctive clause in Definition 8 introduces graph edges for just this case. In other words: $w_i[x_i] < r_i^h[x_j]$, $i \neq j$ is impossible for committing transactions $T_i$ and $T_j$ if $MVSG$ is acyclic.

3.3 Interpretation of MC-Histories 

Intuitively, not all serial MC-histories should be considered serializable. To understand this, let us reconsider $H_1$ from above: $m_3^{1,4}$ accesses a cached method result which is based on the version of $x$ such as read by $T_1$. However, in the meantime, $T_2$ wrote $x$ and might have created a new value for it. Further, $r_3^5[x]$ read the value of $x$ written by $T_2$. This means that $m_3^{1,4}$ refers to another value of $x$ than $r_3^5[x]$, although this should not be the case. Still $H_1$ is serial. If the method call that caused $m_3^{1,4}$ had not been a cache hit but had been executed normally, it would have read $x$ by some operation $r_3^h[x]$. And this value would have been the value written by $T_2$.

The conventional definition for serializable 1-version histories is based on the serializability of serial histories. Unfortunately, as just seen, this approach is not applicable to MC-histories. Then what is a good definition of serializability for MC-histories? As a solution we will interpret MC-histories as multiversion histories by means of an embedding function $MV$. $MV$ maps all operations of an MC-history to one or more multiversion operations. This way $MV$ produces a multiversion history that exactly reflects all the conflicts that exist for $H$.

Let us begin with an example to convey these intentions. Assume $H_1$ from above is mapped to the following multiversion history:

$$MV(H_1) = r_1^4[y_0]\, r_1^4[x_0]\, c_1\, w_2[x_2]\, c_2\, r_3^h[y_0]\, r_3^h[x_0]\, r_3^5[x_2]\, c_3.$$

The original operations $r_1^4[y]\, r_1^4[x]$ are mapped to $r_1^4[y_0]\, r_1^4[x_0]$ where $y_0$ and $x_0$ state the versions that these operations read. $m_3^{1,4}$ is mapped to $r_3^h[y_0]\, r_3^h[x_0]$ since it essentially accesses the same versions of $x$ and $y$ as the read operations to which it refers in $H_1$ (namely $r_1^4[x]$ and $r_1^4[y]$). The superscript $h$ for $r_3^h[y_0]$ and $r_3^h[x_0]$ has been chosen more or less arbitrarily; because of $r_3^5[x_2]$, it must not equal 5. (The superscript is only required for conformance with Definition 6.) Finally $w_2[x_2]$ just writes a respective new version of $x$ and relates to $w_2[x]$ from $H_1$.

In the following, we will generalize the interpretation function $MV$. Thus we can define an MC-history $H$ to be serializable if and only if $MV(H)$'s multiversion serializability graph is acyclic for a write version order. E.g. $MV(H_1)$'s multiversion serializability graph is cyclic for the version order $x_0 \ll x_2$. It contains the version order edges $T_1 \to T_2$ (due to $r_1^4[x_0]$ and $w_2[x_2]$), $T_3 \to T_2$ (due to the read $r_3^h[x_0]$ that replaces $m_3^{1,4}$ and $w_2[x_2]$) as well as the edge $T_2 \to T_3$ (due to $w_2[x_2]$ and $r_3^5[x_2]$). This suits our intuition not to consider $H_1$ as serializable.

For $MV$ it is crucial that it maps all conflicts of an MC-history $H$ to $H$'s multiversion image. Otherwise one might obtain a multiversion history $MV(H)$ that is 1-serializable although its origin $H$ should not be considered serializable. The resulting formalism for MC-histories would then lead to serialization protocols that do not create truly serializable histories. E.g. the history

$$H_3 = r_1^4[y]\, r_1^4[x]\, c_1\, w_2[x]\, c_2\, m_3^{1,4}\, w_3[x]\, c_3$$

should not be considered serializable for similar reasons as $H_1$. However, a naive mapping of $H_3$ like

$$r_1^4[y_0]\, r_1^4[x_0]\, c_1\, w_2[x_2]\, c_2\, r_3^h[y_0]\, r_3^h[x_0]\, w_3[x_3]\, c_3$$

is 1-serializable but ignores the conflict $w_2[x] \nparallel w_3[x]$ in $H_3$ because the respective operations $w_2[x_2]$ and $w_3[x_3]$ do not conflict. So $MV$ has to be defined in a way such that this conflict is reflected in $MV(H_3)$. An appropriate definition of $MV$ results in:

$$MV(H_3) = r_1^4[y_0]\, r_1^4[x_0]\, c_1\, w_2[x_2]\, c_2\, r_3^h[y_0]\, r_3^h[x_0]\, r_3^{h'}[x_2]\, w_3[x_3]\, c_3.$$

Here, the operation $r_3^{h'}[x_2]$ has been introduced to ensure that the set of conflicts in respect to transactions from $H_3$ and $MV(H_3)$ remain identical. The next definition states the general structure of $MV$.

DEFINITION 9. Let $H$ be an MC-history with the transactions $T = \{T_1,\dots,T_n\}$. The function

$$V : H \to \{1,\dots,n\}, \quad p \mapsto k$$

shall return the index $k$ of the last write operation $w_k[x] \in H$ before $p$ such that $c_k \in H$. If no such $w_k[x]$ exists, $V(p)$ shall be zero, so $V(p) = 0$. Further, the function

$$ss : \mathbb{N} \times \mathbb{N} \times T \to \mathbb{N}, \quad (i,j,T_j) \mapsto h$$

shall return a unique number for an argument $(i,j,T_j)$ such that $h \notin \{k \mid r_j^k[x] \in T_j\}$. 5

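The interpretation function $MV$ is then defined by means of an auxiliary function $mv$ with $mv(a_i) = \{a_i\}$, $mv(c_i) = \{c_i\}$,

$$mv(r_i^j[x]) = \{\, r_i^j[x_{V(r_i^j[x])}] \,\},$$

$$mv(w_i[x]) = \begin{cases} \{w_i[x_i]\} & \text{if } \exists\, r_i^k[x] \in H : r_i^k[x] < w_i[x] \\ \{w_i[x_i],\; r_i^h[x_{V(w_i[x])}]\} & \text{otherwise (with a unique superscript } h \text{ supplied by } ss\text{),} \end{cases}$$

$$mv(m_i^{j,k}) = \{\, r_i^h[x_{V(q)}] \mid q = r_j^k[x] \in H \wedge h = ss(k,j,T_i) \,\}$$

and $MV(H) = \bigcup_{p \in H} mv(p)$.

The partial ordering relation $<'$ for $MV(H)$ is inherited from $H$'s partial ordering relation $<$, more specifically:

$$p <' q :\Leftrightarrow \Big( mv^{-1}(p) < mv^{-1}(q) \vee \big( \{p,q\} \subseteq mv(m_i^{j,k}) \wedge p = r_i^h[x_s] \wedge q = r_i^h[y_t] \wedge r_j^k[x] < r_j^k[y] \big) \Big).$$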

The latter part of the definition of $<'$ deals with ordering read operations that replace method operations. $MV$ produces a well formed multiversion history according to Definition 6. The next theorem shows that for an rw-history $H$, $MV$ produces a multiversion history with (practically) the same serialization graph as $H$.


THEOREM 1. Let $H$ be an rw-history. Further, $SG^*(H)$ shall be the transitive closure of the 1-version serializability graph of $H$ (according to [Bernstein et al. 1987]) and $MVSG^*(MV(H))$ shall be the transitive closure of the multiversion serializability graph of $MV(H)$ with some write version order. Then, the two graphs are identical, so $SG^*(H) = MVSG^*(MV(H))$.

Proof. Obviously, $\forall i \in \{1,\dots,n\} : c_i \in T_i \Leftrightarrow c_i \in mv(T_i)$ holds. This means that conditions for graph edges that request participating transactions to be committed do not have to be considered any further for this proof.

"$\subseteq$": Let $T_i \to T_j$ be in $SG$. Then, there are operations $p \in T_i$, $q \in T_j$ with $p < q$, $p \nparallel q$ and $i \neq j$. Moreover, there is an $x$ with $\{x\} = d(p) \cap d(q)$.

If $a(p) = r$, $a(q) = w$ one has got $r_i^k[x_s] <' w_j[x_j]$ in $MV(H)$ (for some $s$). Thus, $w_s[x] < w_j[x]$ must hold and so $x_s \ll x_j$. This leads to the version order edge $T_i \to T_j \in MVSG$. If $a(p) = w$, $a(q) = r$, one has got $w_i[x_i] <' r_j^k[x_s]$ in $MV(H)$ (for some $s$) with the following two options for $w_s[x]$: Either one obtains the trivial case $i = s$ or $w_i[x] < w_s[x]$. $w_s[x] < w_i[x]$ cannot hold because it would lead to $w_s[x_s] <' w_i[x_i]$ and so to $r_j^k[x_i]$, because in Definition 9 the version index is determined by $V$ (contradiction). Since $c_s \in H$ (according to the definition of $V$), $T_s \to T_j$ is in $MVSG$. As one will see as part of the next case, $w_i[x] < w_s[x]$ implies the edge $T_i \to T_s \in MVSG^*$ and so $T_i \to T_j \in MVSG^*$.

Finally, consider $a(p) = w$, $a(q) = w$: Let $w_i[x] = w_{k_1}[x] < \dots < w_{k_n}[x] = w_j[x]$ be the sequence of all write operations in $H$ in respect to $x$ between $w_i[x]$ and $w_j[x]$ such that $n \geq 2$ and $c_{k_o} \in T_{k_o}$ for all $o \in \{1,\dots,n\}$. Next we prove that there is a path $T_{k_1} \to T_{k_n} \in MVSG^*$ by induction on $n$. $n = 2$: For this case $mv(w_{k_2}[x]) = \{w_{k_2}[x_{k_2}], r_{k_2}^h[x_{k_1}]\}$ due to the definition of $V$ and also $w_{k_1}[x_{k_1}] <' r_{k_2}^h[x_{k_1}]$. Thus, $T_{k_1} \to T_{k_2} \in MVSG$. $n-1 \to n$: The argument is analogous to the case $n = 2$. The only difference is to replace $k_1$ by $k_{n-1}$ and $k_2$ by $k_n$.

"$\supseteq$": Let $T_i \to T_j$ be in $MVSG$. $T_i \to T_j$ can be a version order edge or an edge due to $w_i[x_i] <' r_j^h[x_i]$ with $i \neq j$. In particular the case $w_i[x_i] <' r_i^h[x_l]$ with $i \neq l$ (from the first disjunctive clause of Definition 8) can be excluded because of $mv$'s definition.


5 The specific structure of $ss$ is not of interest. Below, it is just required to produce unique superscripts for read operations in respect to a transaction $T_j$.

Consider the case $w_i[x_i] <' r_j^h[x_i]$: According to the definition of $mv$ one has got $w_i[x] < r_j^h[x]$ (if $mv(r_j^h[x]) = \{r_j^h[x_k]\}$ for some $k$) or $w_i[x] < w_j[x]$ (if $mv(w_j[x]) = \{r_j^h[x_k], w_j[x_j]\}$ for some $k$). So $T_i \to T_j \in SG$. $r_j^h[x_i]$ cannot be in the range of a method operation because $H$ is an rw-history.

If $T_i \to T_j \in MVSG$ is a version order edge one has got two cases. Case 1: $w_i[x_i], r_k^h[x_j] \in MV(H)$ (for some $k$) with $x_i \ll x_j$. $i \neq 0$ holds because of $w_i[x_i]$ and because $\ll$ is a version order. Thus, $j > 0$, which implies that a $w_j[x_j]$ exists in $MV(H)$. Since $\ll$ is a write version order, $w_i[x] < w_j[x]$ follows and further, $T_i \to T_j \in SG$ follows.

Case 2: One has got two operations $r_i^h[x_k], w_j[x_j] \in MV(H)$ (for some $k$) with $x_k \ll x_j$. There are two subordinate cases, namely $r_i^h[x_k] <' w_j[x_j]$ and $w_j[x_j] <' r_i^h[x_k]$. (The two operations can be compared by means of $<'$, since their preimages $p,q \in H$ in respect to $mv$ must be conflicting and so $p < q$ or $q < p$, but this relationship is maintained by $<'$.) Consider $r_i^h[x_k] <' w_j[x_j]$ first. Then, $r_i^h[x_k] \in mv(r_i^h[x])$ or $r_i^h[x_k] \in mv(w_i[x])$ and $r_i^h[x] < w_j[x]$ respectively $w_i[x] < w_j[x]$ follows. So $T_i \to T_j \in SG$. ($mv^{-1}(r_i^h[x_k])$ cannot be a method operation because $H$ is an rw-history.) Secondly, consider $w_j[x_j] <' r_i^h[x_k]$. Due to the definition of $V$, $k$ cannot be zero and so, with $x_k \ll x_j$ one obtains $w_k[x] < w_j[x]$. If $r_i^h[x_k] \in mv(r_i^h[x])$ holds, it follows that $w_k[x] < w_j[x] < r_i^h[x]$ which implies $V(r_i^h[x]) \neq k$. This is a contradiction to $r_i^h[x_k] \in mv(r_i^h[x])$. Finally, if $r_i^h[x_k] \in mv(w_i[x])$ one obtains $w_k[x] < w_j[x] < w_i[x]$ and thus $V(w_i[x]) \neq k$. However, this also contradicts $r_i^h[x_k] \in mv(w_i[x])$. The previous considerations have covered all cases for edges $T_i \to T_j \in MVSG$. □

Using Definition 9 one can interpret MC-histories as ordinary multiversion histories. However, an MC-history does not exhibit the same complexity as its underlying multiversion history. (E.g. MC-histories without m-operations may be considered as ordinary 1-version histories.) Therefore the introduction of m-operations greatly simplifies the development of m-scheduler protocols.

Theorem 1 stated that the chosen interpretation function $MV$ is appropriate when applied to an rw-history $H$, since $MV(H)$ essentially holds the same serializability graph as $H$. Moreover, $MV$ interprets an m-operation as a set of read operations accessing just the versions of data elements which were used when the respective cached method result was first computed. These facts justify the following definition of serializable MC-histories.

Definition 10. An MC-history $H$ is MC-serializable iff $MVSG(MV(H))$ is acyclic in respect to some write version order.

E.g. $H_1$ and $H_3$ from above are not MC-serializable because the corresponding multiversion serializability graph is cyclic (and $x_0 \ll x_2$ matches the write version order predicate).

3.4 Serializability Theorem for MC-Histories 

Using Definition 10 one can decide whether an MC-history $H$ is MC-serializable by computing $MV(H)$ and then checking the resulting history's multiversion serializability graph for cycles. Clearly, it would be more convenient if we had a serializability theorem which applies right to $H$ instead of $MV(H)$. The next definition states how a respective graph should be constructed for $H$.

DEFINITION 11. Let $H$ be an MC-history for the transactions $\{T_1,\dots,T_n\}$. The MC-serializability graph $MCSG \subseteq \{T_1,\dots,T_n\}^2$ for $H$ is given by the following predicate:

$$(T_i,T_j) \in MCSG :\Leftrightarrow c_i \in T_i \wedge c_j \in T_j \wedge \Big($$

$$(\exists p \in T_i : \exists q \in T_j : a(p) \neq m \wedge a(q) \neq m \wedge p \nparallel q \wedge p < q) \vee$$

$$(\exists\, m_i^{k,l}, w_j[x], r_k^l[x] \in H : r_k^l[x] < w_j[x] \wedge (i \neq j \vee w_j[x] < m_i^{k,l})) \vee$$

$$(i \neq j \wedge \exists\, w_i[x], m_j^{k,l}, r_k^l[x] \in H : w_i[x] < r_k^l[x]) \Big).$$

Instead of $(T_i,T_j) \in MCSG$ we simply write $T_i \to T_j$.

Consider $H_1$ from above. Its MC-serializability graph consists of $T_1 \to T_2$ (due to $r_1^4[x] \nparallel w_2[x]$), $T_2 \to T_3$ (due to $w_2[x] \nparallel r_3^5[x]$) and $T_3 \to T_2$ (due to $w_2[x] \nparallel m_3^{1,4}$). These are the same edges as in $MVSG(MV(H_1))$ (with $x_0 \ll x_2$). This observation gives rise to proving the serializability theorem for MC-histories which is stated next.
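The three clauses of Definition 11 translate fairly directly into code. The following is a minimal Python sketch for totally ordered histories; the tuple encoding (`("r", i, l, x)`, `("w", i, x)`, `("m", j, k, l)`, `("c", i)`), the function names and the encoding of $H_1$ are assumptions made for illustration.

```python
def mcsg(history):
    """Edges (i, j) of the MC-serializability graph, following Definition 11."""
    pos = {op: n for n, op in enumerate(history)}
    committed = {op[1] for op in history if op[0] == "c"}
    reads = [o for o in history if o[0] == "r"]
    writes = [o for o in history if o[0] == "w"]
    methods = [o for o in history if o[0] == "m"]
    edges = set()

    # Clause 1: conflicting read/write operations p < q of different transactions.
    for p in reads + writes:
        for q in reads + writes:
            same_item = p[-1] == q[-1]  # the data element is the last field
            if (p[1] != q[1] and same_item and "w" in (p[0], q[0])
                    and pos[p] < pos[q]):
                edges.add((p[1], q[1]))

    for m in methods:
        _, t, k, l = m
        for r in (o for o in reads if o[1] == k and o[2] == l):
            for w in (o for o in writes if o[2] == r[3]):
                # Clause 2: the cached read precedes the write.
                if pos[r] < pos[w] and (t != w[1] or pos[w] < pos[m]):
                    edges.add((t, w[1]))
                # Clause 3: the write precedes the cached read.
                if pos[w] < pos[r] and w[1] != t:
                    edges.add((w[1], t))

    # Only committed transactions take part in the graph.
    return {(i, j) for i, j in edges if i in committed and j in committed}


def has_cycle(edges):
    """Depth-first search for a directed cycle (self-loops count as cycles)."""
    succ = {}
    for i, j in edges:
        succ.setdefault(i, set()).add(j)
    done, on_path = set(), set()

    def visit(node):
        if node in on_path:
            return True
        if node in done:
            return False
        on_path.add(node)
        found = any(visit(n) for n in succ.get(node, ()))
        on_path.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in list(succ))


# H_1 of the running example, in the assumed encoding.
H1 = [("r", 1, 4, "y"), ("r", 1, 4, "x"), ("c", 1),
      ("w", 2, "x"), ("c", 2),
      ("m", 3, 1, 4), ("r", 3, 5, "x"), ("c", 3)]
```

With this encoding `mcsg(H1)` yields the three edges named above, and `has_cycle` reports the cycle between $T_2$ and $T_3$.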

THEOREM 2. Let $H$ be an MC-history. $MCSG^*(H)$ shall be the transitive closure of the MC-serializability graph of $H$ and $MVSG^*(MV(H))$ shall be the transitive closure of the multiversion serializability graph of $MV(H)$ in respect to some write version order. Then, the two graphs are identical, so $MCSG^*(H) = MVSG^*(MV(H))$.

PROOF. Just as for the proof of Theorem 1, conditions for graph edges that request participating transactions to be committed do not have to be considered any further.

"$\subseteq$": Let $T_i \to T_j$ be in $MCSG$. Due to the first disjunctive clause of Definition 11, $SG(RW(H)) \subseteq MCSG(H)$ holds. (Just compare the first disjunctive clause of Definition 11 with the definition of $SG$ from [Bernstein et al. 1987].) So, if $T_i \to T_j \in SG(RW(H))$ then $T_i \to T_j \in MVSG^*(MV(RW(H))) \subseteq MVSG^*(MV(H))$. (This follows from Theorem 1.)

Now, let $T_i \to T_j$ be in $MCSG(H) \setminus SG(RW(H))$. $T_i \to T_j$ can only exist because of the second or the third disjunctive clause of Definition 11. This means that there are either operations $m_i^{k,l}, w_j[x], r_k^l[x]$ with $r_k^l[x] < w_j[x]$ or operations $w_i[x], m_j^{k,l}, r_k^l[x]$ with $w_i[x] < r_k^l[x]$.

For the first case, consider the image in respect to $mv$: $mv(r_k^l[x]) = \{r_k^l[x_s]\}$, $w_j[x_j] \in mv(w_j[x])$ and $r_i^h[x_s] \in mv(m_i^{k,l})$ (for some $s$). With $r_k^l[x_s] <' w_j[x_j]$ it turns out that $s = 0 \vee w_s[x_s] <' w_j[x_j]$ and one gets the version order $x_s \ll x_j$. For the case $i \neq j$ the operations $r_i^h[x_s]$ and $w_j[x_j]$ result in the version order edge $T_i \to T_j \in MVSG$ (see last disjunctive clause of Definition 8). If otherwise $i = j$ holds, it follows that $w_j[x_j] = w_i[x_i] <' r_i^h[x_s]$ for the second disjunctive clause of Definition 11. Since $i \neq s$, one obtains $T_i \to T_j \in MVSG$ (with $T_j = T_i$) because of the first disjunctive clause of Definition 8.

If there are operations $w_i[x], m_j^{k,l}, r_k^l[x]$ with $w_i[x] < r_k^l[x]$ that cause $T_i \to T_j \in MCSG$, then their images in respect to $mv$ behave as follows: $w_i[x_i] <' r_k^l[x_s] <' r_j^h[x_s]$ (for some $s$). The case $i = s$ is trivial. Otherwise one can conclude by induction as in the proof of Theorem 1 that $T_i \to T_s \in MVSG^*$ with $w_s[x_s] \in T_s$. Thus $T_i \to T_j$ is in $MVSG^*$. (Note that $s > 0$ because of $V$'s definition and because of $w_i[x_i]$.)

"$\supseteq$": Let $T_i \to T_j$ be in $MVSG(MV(H))$. Theorem 1 has already considered all edges that relate to conflicts between read and write operations but not method operations. Therefore, it suffices to analyze edges in $MVSG$ that are caused by the additional images of method operations in respect to $mv$. So, let $r_n^h[x_s] \in mv(m_n^{k,l})$ and $w_t[x_t]$ be operations that cause a respective edge $T_i \to T_j \in MVSG$. According to Definition 8 one has to distinguish four cases: $i = j = n = t, i \neq s, w_i[x_i] <' r_i^h[x_s]$ or $i \neq j, i = s = t, j = n$ or $i \neq j, i = t, j = s, x_t \ll x_s$ or $i \neq j, n = i, t = j, x_s \ll x_t$.

In the first case, one has got operations $r_k^l[x] < w_i[x] < m_i^{k,l}$ or $w_i[x] < w_o[x] < r_k^l[x] < m_i^{k,l}$, since otherwise $i = s$ would hold. $r_k^l[x] < w_i[x] < m_i^{k,l}$ results in the edge $T_i \to T_j \in MCSG$ with $i = j$ from the second disjunctive clause of Definition 11. $w_i[x] < w_o[x] < r_k^l[x] < m_i^{k,l}$ results in $T_i \to T_o \to T_i \in MCSG$.

The second case leads to $w_i[x_i] <' r_j^h[x_i] \in mv(m_j^{k,l})$ with $w_i[x_i] \in mv(w_i[x])$. Therefore, there exists an $r_k^l[x]$ with $w_i[x] < r_k^l[x]$ in $H$. If $r_k^l[x] < w_i[x]$ would hold, applying $mv$ would return $mv(r_k^l[x]) = \{r_k^l[x_g]\}$ for some $g \neq i$. This would lead to $r_j^h[x_g] \in mv(m_j^{k,l})$ instead of $r_j^h[x_i] \in mv(m_j^{k,l})$ (contradiction). Thus, $T_i \to T_j \in MCSG$ follows from the last disjunctive clause of Definition 11.

Consider the case $i \neq j, i = t, j = s, x_t \ll x_s$: Here, $w_i[x] < w_j[x]$ follows right away because $\ll$ is a write version order. (Note that $t = i$ cannot be zero.)

The last case creates the situation $x_s \ll x_j$, $r_i^h[x_s] \in mv(m_i^{k,l})$ and $w_j[x_j] \in mv(w_j[x])$ with $w_j[x] \in H$. Moreover, due to $m_i^{k,l}$, there must be an $r_k^l[x] \in H$ with $r_k^l[x] < m_i^{k,l}$. If $r_k^l[x] < w_j[x]$ holds, one obtains $T_i \to T_j$ for the second disjunctive clause of Definition 11. Now consider $w_j[x] < r_k^l[x]$: If $r_k^l[x]$ reads from $T_j$, applying $mv$ results in $w_j[x_j] <' r_k^l[x_j] <' r_i^h[x_j] \in mv(m_i^{k,l})$ and so $j = s$, but this is a contradiction to $x_s \ll x_j$. Otherwise $r_k^l[x]$ reads $x$ from a $T_o \neq T_j$ and one has got $w_j[x] < w_o[x] < r_k^l[x]$. Applying $mv$ results in $w_j[x_j] <' w_o[x_o] <' r_k^l[x_o] <' r_i^h[x_o] \in mv(m_i^{k,l})$. Thus, $s = o$ and finally $x_j \ll x_s$ follows (because of $w_j[x_j] < w_o[x_o]$). However, this contradicts the case's precondition. □


Given an MC-history $H$, Theorem 2 confirms that the transitive closure of $H$'s MC-serializability graph is identical to the transitive closure of $MV(H)$'s multiversion serializability graph. Since a transitive closure neither adds nor removes graph cycles, we can indeed rely on Definition 11 to check for MC-serializability.


4. RECOVERY FOR MC-HISTORIES 

Before developing a serializability protocol for transactional method caching, we want to address the simpler task of creating a recovery protocol. In this respect, we are interested in applying conventional recovery qualities such as "recoverable" or "strict". Again, the definition of these qualities must be adapted to the structure of MC-histories. This section defines the corresponding qualities and gives a lemma on which an m-scheduler's recovery protocol can be based. The second part of this section discusses the protocol's implementation.

4.1 Formalism 

DEFINITION 12. Let $H$ be an MC-history with the transactions $T = \{T_1,\dots,T_n\}$. A transaction $T_i \in T$ reads (a data element) $x$ from $T_j \in T$ via an operation $p \in T_i$ iff:

$$\exists\, r_h^k[x], w_j[x] \in H : w_j[x] < r_h^k[x] \wedge (h = i \vee m_i^{h,k} \in H) \wedge \neg(a_j < r_h^k[x]) \wedge \forall w_o[x] \in H : w_j[x] < w_o[x] < r_h^k[x] \Rightarrow a_o < r_h^k[x].$$

We have $p = r_h^k[x]$ if $h = i$ holds for the given predicate and $p = m_i^{h,k}$ otherwise. The relationship between $T_i$, $x$, $T_j$ and $p$ is expressed by $reads(T_i,x,T_j,p)$. $reads$ forms the so called reads-from-relation.

For the MC-history

$$H_4 = w_2[x]\, r_1^1[y]\, r_1^1[x]\, c_1\, c_2\, m_3^{1,1}\, w_3[x]\, c_3$$

we have $reads = \{(T_1,x,T_2,r_1^1[x]), (T_3,x,T_2,m_3^{1,1})\}$. Using the reads-from-relation, most conventional recovery qualities can also be applied to MC-histories.

DEFINITION 13. An MC-history $H$ with the transactions $T = \{T_1,\dots,T_n\}$ and the data elements $D$ is recoverable respectively ACA (avoiding cascading aborts) respectively strict, iff the following qualities hold:

- recoverable:

$$\forall i,j \in \{1,\dots,n\} : \forall x \in D : \forall p \in H : (i \neq j \wedge reads(T_i,x,T_j,p) \wedge c_i \in H) \Rightarrow c_j < c_i,$$

- ACA:

$$\forall i,j \in \{1,\dots,n\} : \forall x \in D : \forall p \in H : (i \neq j \wedge reads(T_i,x,T_j,p)) \Rightarrow c_j < p,$$

- strict: $H$ is ACA and

$$\forall w_i[x], w_j[x] \in H : (i \neq j \wedge w_j[x] < w_i[x]) \Rightarrow (a_j < w_i[x] \vee c_j < w_i[x]).$$

Obviously, the standard inclusion statement "strict $\subset$ ACA $\subset$ recoverable" also is true for MC-histories. The four MC-histories $H_5$ to $H_8$, which are presented next, only differ in respect to the placement of $c_1$ but: $H_5$ is not recoverable, $H_6$ is recoverable but not ACA, $H_7$ is ACA but not strict, $H_8$ is strict.

$$H_5 = w_1[x]\, w_1[y]\, w_2[y]\, r_2^1[x]\, m_3^{2,1}\, c_3\, c_1\, c_2,$$

$$H_6 = w_1[x]\, w_1[y]\, w_2[y]\, r_2^1[x]\, m_3^{2,1}\, c_1\, c_3\, c_2,$$

$$H_7 = w_1[x]\, w_1[y]\, w_2[y]\, c_1\, r_2^1[x]\, m_3^{2,1}\, c_3\, c_2,$$

$$H_8 = w_1[x]\, w_1[y]\, c_1\, w_2[y]\, r_2^1[x]\, m_3^{2,1}\, c_3\, c_2.$$
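The classification of these four histories can be reproduced with a small checker. The Python sketch below assumes totally ordered histories in a hypothetical tuple encoding (`("r", h, k, x)` for $r_h^k[x]$, `("w", i, x)`, `("m", i, h, k)` for $m_i^{h,k}$, `("c", i)`, `("a", i)`); `reads_from` and `classify` are illustrative names and follow Definitions 12 and 13 as stated above.

```python
def reads_from(history):
    """reads(T_i, x, T_j, p) of Definition 12 as a set of tuples (i, x, j, p)."""
    pos = {op: n for n, op in enumerate(history)}
    abort_at = {op[1]: pos[op] for op in history if op[0] == "a"}

    def aborted_before(txn, op):
        return txn in abort_at and abort_at[txn] < pos[op]

    result = set()
    for r in (o for o in history if o[0] == "r"):
        _, h, k, x = r
        writes_before = [o for o in history
                         if o[0] == "w" and o[2] == x and pos[o] < pos[r]]
        for w in writes_before:
            if aborted_before(w[1], r):
                continue
            # Every write between w and r must stem from a transaction
            # that aborted before r.
            later = [o for o in writes_before if pos[w] < pos[o]]
            if not all(aborted_before(o[1], r) for o in later):
                continue
            result.add((h, x, w[1], r))          # p is the read itself
            for m in history:                    # p is a method op citing r
                if m[0] == "m" and m[2] == h and m[3] == k:
                    result.add((m[1], x, w[1], m))
    return result


def classify(history):
    """Return (recoverable, ACA, strict) following Definition 13."""
    pos = {op: n for n, op in enumerate(history)}
    commit_at = {op[1]: pos[op] for op in history if op[0] == "c"}
    end_at = {op[1]: pos[op] for op in history if op[0] in ("c", "a")}
    foreign = [(i, j, p) for i, _, j, p in reads_from(history) if i != j]

    recoverable = all(j in commit_at and commit_at[j] < commit_at[i]
                      for i, j, _ in foreign if i in commit_at)
    aca = all(j in commit_at and commit_at[j] < pos[p] for _, j, p in foreign)
    writes = [o for o in history if o[0] == "w"]
    strict = aca and all(
        end_at.get(wj[1], len(history)) < pos[wi]
        for wj in writes for wi in writes
        if wj[1] != wi[1] and wj[2] == wi[2] and pos[wj] < pos[wi])
    return recoverable, aca, strict


w1x, w1y, w2y = ("w", 1, "x"), ("w", 1, "y"), ("w", 2, "y")
r2x, m3 = ("r", 2, 1, "x"), ("m", 3, 2, 1)
c1, c2, c3 = ("c", 1), ("c", 2), ("c", 3)

H5 = [w1x, w1y, w2y, r2x, m3, c3, c1, c2]
H6 = [w1x, w1y, w2y, r2x, m3, c1, c3, c2]
H7 = [w1x, w1y, w2y, c1, r2x, m3, c3, c2]
H8 = [w1x, w1y, c1, w2y, r2x, m3, c3, c2]
```

In this sketch the method operation $m_3^{2,1}$ inherits its reads-from entry from $r_2^1[x]$, which is why moving $c_1$ alone is enough to change the classification.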

The next lemma states how an m-scheduler can ensure that, together with the rw-scheduler, it produces ACA MC-histories. By requesting an MC-history's rw-projection to be ACA the lemma assumes that the rw-scheduler will already provide ACA rw-histories. Given that the m-scheduler guarantees an additional predicate, the joint MC-history will be ACA too.

LEMMA 1. Let $H$ be an MC-history for the transactions $T = \{T_1,\dots,T_n\}$ and let the following predicate hold:

$$\forall T_i \in T : \forall x \in D : reads(T_i,x,T_i,r_i^k[x]) \Rightarrow \forall m_j^{i,k} \in H : i \neq j \Rightarrow c_i < m_j^{i,k}.$$

Then, $H$ is ACA iff $RW(H)$ is ACA.

PROOF. "$\Rightarrow$": Let $H$ be ACA. Since $RW(H) \subseteq H$ holds, the reads-from-relation of $RW(H)$ is a subset of $H$'s reads-from-relation. Therefore $RW(H)$ is also ACA.

"$\Leftarrow$": Let $RW(H)$ be ACA. Thus, in respect to $H$ only the additional method operations might violate ACA. Let $m_j^{k,l} \in H$ be such a method operation that reads from $T_i$ via $w_i[x]$, so $reads(T_j,x,T_i,m_j^{k,l})$ holds. Due to Definition 12 there must also be an $r_k^l[x]$ with $reads(T_k,x,T_i,r_k^l[x])$. Further, $r_k^l[x] < m_j^{k,l}$ must hold because of Definition 2. If $k \neq i$ then $c_i < r_k^l[x] < m_j^{k,l}$ follows, since $RW(H)$ is ACA. Otherwise, one obtains $reads(T_i,x,T_i,r_i^l[x])$ and so $c_i < m_j^{k,l}$ if $i \neq j$ due to the Lemma's predicate. In either case $H$ is ACA. □

The next MC-history shows that the predicate of Lemma 1 is necessary:

$$H_9 = w_1[x]\, r_1^1[x]\, m_2^{1,1}\, c_1\, c_2$$

is not ACA because of $reads(T_2,x,T_1,m_2^{1,1})$ and $m_2^{1,1} < c_1$. However, $RW(H_9) = w_1[x]\, r_1^1[x]\, c_1\, c_2$ is ACA.

As the following example shows, Lemma 1 cannot be rephrased for MC-histories that are just recoverable:

$$H_{10} = w_1[x]\, r_2^1[x]\, m_3^{2,1}\, c_3\, c_1\, c_2$$

is not recoverable, since the relation $reads(T_3,x,T_1,m_3^{2,1})$ holds and $c_3 < c_1$. Still, $RW(H_{10})$ is recoverable.

If one wants MC-histories to be strict and not just ACA, it suffices to keep the predicate from Lemma 1 and to expect the rw-scheduler to produce strict rw-histories:

Lemma 2. An MC-history $H$ is strict iff $H$ is ACA and $RW(H)$ is strict.

PROOF. "$\Rightarrow$": Let $H$ be strict. Since $RW$ does not remove any commit or abort operations, $RW(H)$ must be strict too.

"$\Leftarrow$": Let $H$ be ACA and $RW(H)$ be strict. In respect to $H$ only method operations must be checked. However, additional method operations do not impact the strictness predicate for write operations from Definition 13. □

4.2 Implementation 

We now describe a simple protocol that produces ACA respectively strict MC-histories given that the rw-scheduler creates ACA respectively strict rw-histories. As stated by Lemmas 1 and 2, the m-scheduler's job is just to guarantee the predicate of Lemma 1. Surprisingly, this can be done entirely on the client side of a related system: For every transaction $T_i$ started at the client, the method cache keeps a flag which indicates whether or not a write method call has already occurred inside $T_i$. (For a new transaction the flag is false, meaning no write method call has occurred yet.) After the first write method call of $T_i$, every new method call result $r$ which is computed inside $T_i$ and stored in the method cache remains locked at the client until $T_i$ ends. The lock prevents concurrent transactions from producing a cache hit on $r$ before $T_i$ ends. At $T_i$'s commit, the lock is removed and other transactions may access $r$. However, if $T_i$ aborts, then $r$ is entirely removed from the cache.

The protocol is correct because $reads(T_i,x,T_i,r_i^l[x])$ from the predicate of Lemma 1 can only hold if some write operation has ever occurred in $T_i$. When this happens, the lock on new cached method results produced by $T_i$ prevents other transactions from reading those cached method results before $T_i$ has committed.
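The client-side bookkeeping described above can be sketched in a few lines of Python. The class `MethodCache`, its method names and the `MISS` sentinel are hypothetical and only illustrate the flag-and-lock idea; in particular, letting the owning transaction hit its own locked entries is an assumption, since the text only speaks of concurrent transactions.

```python
MISS = object()  # sentinel returned when no usable cached result exists


class MethodCache:
    """Sketch of the recovery protocol's client-side flag and lock handling."""

    def __init__(self):
        self._results = {}      # cache key -> cached method result
        self._locked_by = {}    # cache key -> transaction holding the lock
        self._has_written = {}  # transaction -> write method call seen?

    def begin(self, txn):
        self._has_written[txn] = False  # new transaction: no write call yet

    def note_write_call(self, txn):
        self._has_written[txn] = True

    def store(self, txn, key, result):
        self._results[key] = result
        if self._has_written[txn]:
            # Results computed after the first write call stay locked
            # until the transaction ends.
            self._locked_by[key] = txn

    def lookup(self, txn, key):
        owner = self._locked_by.get(key)
        if owner is not None and owner != txn:
            return MISS  # locked by a concurrent transaction: no cache hit
        return self._results.get(key, MISS)

    def commit(self, txn):
        self._end(txn, drop=False)  # unlock, keep the results

    def abort(self, txn):
        self._end(txn, drop=True)   # unlock and remove the results

    def _end(self, txn, drop):
        for key in [k for k, o in self._locked_by.items() if o == txn]:
            del self._locked_by[key]
            if drop:
                self._results.pop(key, None)
        self._has_written.pop(txn, None)
```

A caller would treat `MISS` like an ordinary cache miss and forward the method call to the server.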

5. OPTIMISTIC CACHING TIMESTAMP PROTOCOL 
5.1 Formalism 

This section presents an optimistic caching timestamp protocol (OCTP) for scheduling method operations as part of MC-histories. An m-scheduler that applies this protocol can be integrated with an rw-scheduler that follows a timestamp protocol itself but also with a strict two-phase lock protocol. An integration with a strict two-phase lock protocol is possible by interpreting the rw-scheduler's commit order as a timestamp order. ([Bernstein et al. 1987] showed that this is legitimate.)

Apart from the protocol presented next, we have developed another serialization protocol for an m-scheduler whose essential idea is related to the one of OCC from [Adya et al. 1995]. For a more compact contribution we do not present this protocol. We prefer to present OCTP mainly because it is a strong improvement over the OCC-like protocol: It accepts a superset of histories that the OCC-like protocol accepts 6 and it causes much lower transaction abortion rates. The latter statement is substantiated by the experiments from Section 6. 7 As opposed to the OCC-like protocol, the correctness of OCTP is not straightforward to see. We will have to make good use of the formalism from Section 3 to prove it correct.

The fundamental concept of timestamp protocols is the timestamp. For clarity and completeness we define timestamps next.

Definition 14. Let $H$ be an MC-history with the transactions $T = \{T_1,\dots,T_n\}$. $ts : \{T_1,\dots,T_n\} \to \mathbb{N}$ is a timestamp function iff

$$\forall i,j \in \{1,\dots,n\} : ts(T_i) = ts(T_j) \Rightarrow i = j.$$

For conventional timestamp protocols conflicting operations should be ordered along the 
timestamp order of the transactions to which they belong. 

DEFINITION 15. Let $H$ be an MC-history with the transactions $T = \{T_1,\dots,T_n\}$. $H$ is t-ordered in respect to a timestamp function $ts$ iff

$$\forall p,q \in H : \forall i,j \in \{1,\dots,n\} : (p \in T_i \wedge q \in T_j \wedge p \nparallel q \wedge ts(T_i) < ts(T_j)) \Rightarrow (a_i \in H \vee a_j \in H \vee p < q).$$

It is well known and easy to prove that t-ordered rw-histories are serializable. The reason for this is that conflicting read and write operations dictate the direction of edges in a respective serializability graph. However, for a method operation that conflicts with a write operation the direction of a respective edge in the MC-serializability graph does not necessarily depend on the two operations' order. E.g. $H_1$ from above is t-ordered for the timestamp function $ts(T_i) = i$ but the operations $r_1^4[x] < w_2[x] < m_3^{1,4}$ produce an edge $T_3 \to T_2$. Therefore the timestamp rule does not guarantee MC-serializability.

In the following, an edge $T_j \to T_i$ is called a reverse edge, if and only if it is produced by two conflicting operations $p \in T_i$ and $q \in T_j$ with $ts(T_i) < ts(T_j)$. Otherwise we call it a normal edge.

Interestingly, if an MC-history $H$ is t-ordered, $H$'s reverse edges can only be created by the condition $\exists\, m_i^{k,l}, w_j[x], r_k^l[x] \in H : (r_k^l[x] < w_j[x] \wedge (i \neq j \vee w_j[x] < m_i^{k,l}))$ from Definition 11. This implies that the read operation $r_k^l[x]$ to which $m_i^{k,l}$ refers must have occurred before $w_j[x]$.

One way to develop a timestamp protocol for MC-histories would be to entirely forbid reverse edges. 8 But we can go a more general way and trade off reverse edges against normal graph edges! To illustrate this idea, consider the following prefix of $H_1$:

$$r_1^4[y]\, r_1^4[x]\, c_1\, w_2[x]\, c_2\, m_3^{1,4}.$$

When $m_3^{1,4}$ is scheduled it produces the reverse edge $T_3 \to T_2$. So afterwards the scheduler's duty should be to avoid edges from $T_2$ to $T_3$. At the point of time when the m-scheduler accepts $m_3^{1,4}$, $T_3$ has still a (good) chance to commit. However if we forbade reverse edges entirely, the m-scheduler would have to reject $m_3^{1,4}$ and thus abort $T_3$ right away.

6 This can be proven.

7 The effect can also be explained analytically but this is beyond the scope of this paper.

8 This approach leads to the OCC-like protocol mentioned at the beginning of this section.

Fig. 5. An MC-Serializability Graph to Illustrate the Idea behind OCTP

In general, the following rule should hold: If the m-scheduler accepts a method operation
producing a reverse edge $T_i \to T_j$ then it should ensure that there are no edges $T_{j'} \to T_i$ with
$ts(T_j) \le ts(T_{j'})$. As an example, suppose the graph from Figure 5 was an MC-serializability
graph with the timestamp function $ts(T_i) = i$. The dotted arrows then represent reverse
edges. According to the stated rule, the graph edge 7s —> T(, must be excluded because of 
the reverse edge T(, —► Tj. Similarly, 7) —> 7) contradicts the rule due to the reverse edge 
T 2 —» Tj. But how about 7) —> 7),? It adheres to the stated rule and still leads to a graph 
cycle. Apparently, it does not suffice to consider single reverse edges. Instead, one has to
consider paths of reverse edges. In Figure 5 a path of reverse edges starting from $T_6$ leads
back to $T_2$. Therefore, no transactions $T_j$ with $ts(T_j) \ge ts(T_2)$ should point to $T_6$.

The function $ts_{fit}(T_i)$, which is defined next, computes the minimum timestamp of all
those transactions that can be reached from transaction $T_i$ via paths consisting exclusively
of reverse edges. The computation is based on the operation order of an underlying MC-history
(prefix) and can be performed dynamically by the m-scheduler. The function forms
the basis of a respective serializability protocol.

DEFINITION 16. Let $H$ be an MC-history with the transactions $\{T_1,\ldots,T_n\}$ and a
timestamp function $ts$. The fitting timestamp function

$ts_{fit} : \{T_i \mid i \in \{1,\ldots,n\} \wedge c_i \in T_i\} \to \mathbb{N}$

is computed as follows:

$ts_{fit}(T_i) = \min(\{ts(T_i)\} \cup \{ts_{fit}(T_j) \mid \exists w_j[x], m_i^{k,l}, r_k^l[x] \in H : r_k^l[x] < w_j[x] \wedge ts(T_j) < ts(T_i) \wedge c_j \in H\}).$
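The recursion in Definition 16 can be illustrated with a small Java sketch. This is only an illustration under stated assumptions, not the paper's implementation: the class `FittingTimestamps`, the map `reverseEdges` and the method `tsFit` are hypothetical names, committed transactions are identified by their timestamps, and the reverse edges are assumed to be given explicitly rather than derived from an MC-history.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: evaluates ts_fit for committed transactions,
// each of which is identified by its timestamp.
final class FittingTimestamps {
    // reverseEdges.get(ts) lists the timestamps of committed transactions that
    // the transaction with timestamp ts points to via ONE reverse edge.
    private final Map<Integer, List<Integer>> reverseEdges;
    private final Map<Integer, Integer> memo = new HashMap<>();

    FittingTimestamps(Map<Integer, List<Integer>> reverseEdges) {
        this.reverseEdges = reverseEdges;
    }

    int tsFit(int ts) {
        Integer cached = memo.get(ts);
        if (cached != null) return cached;
        int min = ts; // the set under min(...) always contains ts(T_i) itself
        for (int target : reverseEdges.getOrDefault(ts, List.of())) {
            // A reverse edge always leads to a strictly smaller timestamp,
            // which is why the recursion terminates.
            if (target >= ts) throw new IllegalArgumentException("not a reverse edge");
            min = Math.min(min, tsFit(target));
        }
        memo.put(ts, min);
        return min;
    }
}
```

With the assumed reverse edges 6 → 4 and 4 → 2, `tsFit(6)` follows the path of reverse edges down to timestamp 2, which mirrors the "path of reverse edges" argument made for Figure 5.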

LEMMA 3. $ts_{fit}$ is well defined.

PROOF. Consider $ts_{fit}(T_i)$ according to Definition 16. The argument of $\min(\ldots)$ is a
non-empty set, since it contains $ts(T_i)$. Further, every $T_j$ referenced by the set $\{ts_{fit}(T_j) \mid \ldots\}$
from above has committed (so $c_j \in T_j$) and lies in the domain of $ts_{fit}$. For every $T_j$ referenced
by $\{ts_{fit}(T_j) \mid \ldots\}$ we have $ts(T_j) < ts(T_i)$. Since there are at most $n$ timestamps in
the range of $ts$, the computation of $ts_{fit}(T_i)$ terminates. □

Using $ts_{fit}$ we can define the quality "t-fitting" for MC-histories, which formalizes the
generalized rule for reverse edges from above.

DEFINITION 17. Let $H$ be an MC-history with the transactions $\{T_1,\ldots,T_n\}$, a timestamp
function $ts$ and the MC-serialization graph MCSG. $H$ is t-fitting in respect to $ts$ iff

$\forall i, j \in \{1,\ldots,n\} : (T_i \to T_j \in MCSG \wedge ts(T_i) < ts(T_j)) \Rightarrow ts(T_i) < ts_{fit}(T_j).$
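The condition of Definition 17 can be checked mechanically once the graph edges and the fitting timestamps are known. The following Java sketch is a hedged illustration only: `TFittingCheck`, the edge encoding as `{from, to}` timestamp pairs and the `tsFit` map are assumptions made for the example and do not appear in the paper.

```java
import java.util.Map;

// Hypothetical checker for the "t-fitting" condition.
final class TFittingCheck {
    // edges: each entry {from, to} is an edge T_from -> T_to of the MC-serialization
    // graph, with transactions identified by their timestamps.
    // tsFit: maps a transaction's timestamp to its fitting timestamp.
    static boolean isTFitting(int[][] edges, Map<Integer, Integer> tsFit) {
        for (int[] e : edges) {
            int from = e[0], to = e[1];
            boolean normal = from < to; // ts(T_i) < ts(T_j)
            // A normal edge must satisfy ts(T_i) < ts_fit(T_j).
            if (normal && from >= tsFit.get(to)) return false;
        }
        return true;
    }
}
```

For instance, if $T_3$ has a reverse edge to $T_2$ (so its fitting timestamp is 2), the normal edge $T_1 \to T_3$ is acceptable, whereas a normal edge $T_2 \to T_3$ violates the condition.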

Unfortunately, t-fitting MC-histories with t-ordered rw-projections don't have to be
MC-serializable. We need two additional qualities to prove a respective theorem. "Irreflexive"
avoids edges $T_i \to T_i$ in an MC-serializability graph. For an operation sequence of the
kind $w_i[x] < r_k^l[x] < m_j^{k,l}$, "rm-ordered" ensures that $ts(T_i) < ts(T_j)$ holds, if $T_i$ and $T_j$
commit. Luckily, both qualities are uncritical when realizing a corresponding serializability
protocol.

DEFINITION 18. An MC-history $H$ is irreflexive iff

$\nexists w_i[x], m_i^{k,l}, r_k^l[x] \in H : r_k^l[x] < w_i[x] < m_i^{k,l} \wedge a_i \notin H.$

Consider a client transaction $T_i$ which causes a write operation $w_i[x]$ at the server. The
base protocol from Section 2.2 causes cached method results to be removed from the
client's cache right before the method invocation causing $w_i[x]$ returns control to the client
code. Therefore a cache hit corresponding to $m_i^{k,l}$ with $w_i[x] < m_i^{k,l}$ cannot happen and the
base protocol implicitly ascertains "irreflexive".

DEFINITION 19. An MC-history $H$ is rm-ordered in respect to a timestamp function $ts$ iff

$\forall i, j \in \{1,\ldots,n\} : (\exists w_i[x], m_j^{k,l}, r_k^l[x] \in H : w_i[x] < r_k^l[x] < m_j^{k,l}) \Rightarrow (a_i \in H \vee a_j \in H \vee ts(T_i) < ts(T_j)).$

As we will see below, an MC-history is implicitly rm-ordered if the m-scheduler cooperates
with an rw-scheduler that applies a strict two-phase lock protocol. The next theorem
forms the basis of an m-scheduler's implementation of OCTP. It expects the rw-scheduler
to provide t-ordered rw-histories.

THEOREM 3. An irreflexive MC-history $H$ which is t-fitting and rm-ordered in respect
to a timestamp function $ts$ is MC-serializable if $RW(H)$ is t-ordered in respect to $ts$.

PROOF. Assume $H$'s MC-serialization graph MCSG was cyclic. A cycle in MCSG has
at least a length of 2, because for all disjunctive clauses from Definition 11 but the case
$r_k^l[x] < w_i[x] < m_i^{k,l}$, $i \neq j$ holds for a corresponding edge $T_i \to T_j$. However, the case
$r_k^l[x] < w_i[x] < m_i^{k,l}$ is excluded because $H$ is irreflexive. A cycle (with two or more nodes)
in MCSG consists of at least one reverse edge. Otherwise one would obtain a cycle
$T_k \to \ldots \to T_k$ with normal edges only and so $ts(T_k) < ts(T_k)$ would hold (contradiction).

The following considerations reveal that for a reverse edge $T_i \to T_j$ one has got operations
$r_k^l[x] < w_j[x]$ and $m_i^{k,l}$ with $ts(T_j) < ts(T_i)$ from the second disjunctive clause of
Definition 11. Edges from the first disjunctive clause of Definition 11 cannot be reverse
edges because the related operations must not be method operations, but $RW(H)$ is expected
to be t-ordered. If an edge from the third disjunctive clause of Definition 11 was a
reverse edge, then one would have operations $w_i[x] < r_k^l[x] < m_j^{k,l}$ with $ts(T_j) < ts(T_i)$. Yet,
this contradicts $H$'s quality to be rm-ordered.

Now, let $C$ be a cycle in MCSG and $T_k$ be the node in $C$ with the smallest timestamp.
There must be a reverse edge $T_{k'} \to T_k \in C$ for some $T_{k'}$, because otherwise $T_k$'s timestamp
would not be minimal in respect to $C$. Further, let $T_j \to \ldots \to T_k$ be the longest acyclic path
in $C$ consisting entirely of reverse edges. Then, there must be an edge $T_i \to T_j \in C$ which
is a normal edge. Otherwise $C$ would consist of reverse edges only and one would obtain
$C = T_k \to \ldots \to T_k$ with $ts(T_k) < ts(T_k)$ (contradiction).

Since $T_i \to T_j$ is not a reverse edge, one has got $ts(T_i) < ts(T_j)$ and even $ts(T_i) < ts_{fit}(T_j)$,
due to $H$ being t-fitting. Since $T_j \to \ldots \to T_k$ only consists of reverse edges, an inductive
application of Definition 16 results in $ts_{fit}(T_j) \le ts(T_k)$. This leads to $ts(T_i) < ts(T_k)$ and
contradicts the assumption that $T_k$'s timestamp is minimal in $C$. Thus MCSG must be
acyclic. □

5.2 Implementation 

This section characterizes a serializability protocol for an m-scheduler which is derived
from Theorem 3. We assume that the rw-scheduler applies a strict two-phase lock protocol
since this protocol is common for commercial database management systems.9

As mentioned at the beginning of Section 5, in case of a strict two-phase lock protocol, the
commit order of rw-transactions may be considered a timestamp order. More specifically,
the timestamp function is implicitly given by $ts(T_i) < ts(T_j) :\Leftrightarrow c_i < c_j$. (As we will see,
aborted transactions are not of interest.)

Since the corresponding rw-histories are strict, the situation $w_i[x] < r_k^l[x] < m_j^{k,l}$ leads
to $w_i[x] < c_i < r_k^l[x] < m_j^{k,l} < c_j$ and so $ts(T_i) < ts(T_j)$ holds due to the chosen timestamp
function. Hence, the quality "rm-ordered" is automatically guaranteed. For serializability
the m-scheduler only needs to ensure "t-fitting".

Figure 6 captures a respective implementation using Java pseudo code and forms an
extension of the base protocol's pseudo code from Figure 4. For simplicity, it assumes
that the m-scheduler is notified of transactional operations by calls to the methods read(),
write(), commit() and abort(). The method methodOp() handles m-operations and is
called by handleReqest() from Figure 4. Except for abort(), the methods do not impact
the system's normal transaction management process but only observe it. However, a
call to abort() is assumed to abort the client-side transaction as well as related resource
manager transactions.

For the m-scheduler to work properly, it is required that an underlying resource manager
processes read, write, commit and abort operations in the same order as they are observed
by the m-scheduler. Further, all those operations must pass the m-scheduler. The implementation
does not yet account for memory management but in fact, all of the code's data
structures can be handled in a way such that their size remains limited. At the end of this
section we will explain how this can be realized.

Transactions are represented by instances of class T, whereby a transaction's timestamp
as well as its fitting timestamp are initially unknown. For that reason, T.ts and T.ts_fit
obtain the value ∞ when a respective transaction begins (Line 7). The lists rl, wl and ml
(Lines 9, 10) store transaction operations in order to detect conflicts with other transactions.
The lists are used at a transaction's commit-time in order to find conflicts with active
transactions (Lines 46 to 57).

9 This is just some legitimate assumption for realizing a respective protocol - what matters most is that the
rw-scheduler produces t-ordered rw-histories according to a timestamp function whose ordering is known to the
m-scheduler.

interface DE {} // Representation of a data element (just a marker interface)
class MId { int k, l; } // ID of a stored method result of operations r_k^l[x], r_k^l[y], ...
class Op { boolean read; DE x; }
class T { // Representation of a transaction T_i
  int id; // The transaction's ID
  List<Op> l = ∅; int nextMId = 0; // From Figure 4
  int ts = ∞, ts_fit = ∞; // Timestamp and fitting timestamp for T_i
  int ts_tol = 0; // Maximum timestamp of transactions producing normal edges to T_i
  Set<DE> rl = ∅, wl = ∅; // For storing data elements which are read respectively written by T_i
  Set<MId> ml = ∅; // For storing T_i's method operations as MId-objects
}

class MScheduler { // Representation of the m-scheduler
  int nextTs = 1; // To create the next timestamp
  Rel<DE, MId> V = ∅; // Relates x with tuples (k,l) with x ∈ d(m^{k,l})
  Rel<DE, T> rt = ∅; // Relates x with T-objects representing T_i's such that r_i^k[x] ∈ T_i
  Rel<DE, T> wt = ∅; // Relates x with T-objects representing T_i's such that w_i[x] ∈ T_i
  Rel<MId, T> mt = ∅; // Relates (k,l) with T-objects representing T_i's such that m_i^{k,l} ∈ T_i
  Map<int,int> txId2Ts = ∅; // Relates a transaction's ID with its timestamp
  synchronized void read(T t, DE x, int k) { // Perform r_i^k[x] with t.id = i
    for each s ∈ wt(x) if (checkTimestamps(s, t)) { abort(t); return; } // Handle rw-conflicts
    t.rl.add(x); rt.add(x, t); // Update relations
    txId2Ts.put(t.id, ∞); t.l.add(new Op(true, x));
  }
  synchronized void write(T t, DE x) { // Perform w_i[x] with t.id = i
    for each s ∈ wt(x) ∪ rt(x)
      if (checkTimestamps(s, t)) { abort(t); return; } // Handle ww- and wr-conflicts
    for each m ∈ V(x) // Handle wm-conflicts in respect to "t-fitting"
      for each s ∈ mt(m) if (checkTimestamps(s, t)) { abort(t); return; }
    t.wl.add(x); wt.add(x, t); // Update relations
    t.l.add(new Op(false, x));
  }
  synchronized void methodOp(T t, MId m) { // Schedule m_i^{k,l} at the m-scheduler with t.id = i
    for each x ∈ V⁻¹(m) // Handle mw-conflicts
      for each s ∈ wt(x) if (s.ts < ∞) {
        // Update t's fitting timestamp if m_i^{k,l} might cause a reverse edge
        if (s.ts > txId2Ts(m.k) && s.ts_fit < t.ts_fit) t.ts_fit = s.ts_fit;
        // If m_i^{k,l} might cause a normal edge, then update t.ts_tol
        if (s.ts <= txId2Ts(m.k) && s.ts > t.ts_tol) t.ts_tol = s.ts;
        if (t.ts_tol >= t.ts_fit) { abort(t); return; } } // Check invariant and abort at a violation
    t.ml.add(m); mt.add(m, t); // Update relations
  }
  synchronized void commit(T t) { // Handle commit of t
    t.ts = nextTs++; txId2Ts.put(t.id, t.ts); // Create the timestamp
    if (t.ts_fit == ∞) t.ts_fit = t.ts; // Adjust ts_fit if necessary
    for each x ∈ t.wl // Update fitting timestamps for active transactions
      for each m ∈ V(x)
        for each s ∈ mt(m) {
          if (s.ts == ∞ && t.ts_fit < s.ts_fit) s.ts_fit = t.ts_fit;
          if (s.ts_tol >= s.ts_fit) abort(s); }
    for each x ∈ t.rl // rw-conflicts // Abort transactions violating "t-fitting" due to t's timestamp
      for each s ∈ wt(x) if (checkTimestamps(t, s)) abort(s);
    for each x ∈ t.wl // ww- and wr-conflicts
      for each s ∈ wt(x) ∪ rt(x) if (checkTimestamps(t, s)) abort(s);
    for each m ∈ t.ml // wm-conflicts
      for each x ∈ V⁻¹(m)
        for each s ∈ wt(x) if (checkTimestamps(t, s)) abort(s);
  }
  boolean checkTimestamps(T a, T b) {
    if (a.ts < ∞ && b.ts == ∞ && a.ts > b.ts_tol) b.ts_tol = a.ts;
    return b.ts == ∞ && b.ts_tol >= b.ts_fit;
  }
  synchronized void abort(T t) { ... } // Abort t
}

Fig. 6. Java Pseudo Code for "t-fitting" at the m-Scheduler

Let $T_i = t$ be a transaction which is represented by an instance of T. The field t.ts_tol
from Line 8 stores the largest timestamp of a committed transaction producing a normal
edge which points to t. t.ts_tol is important to guarantee "t-fitting" throughout t's lifetime:
While normal edges pointing to t may increase the value of t.ts_tol, reverse edges
originating from t may decrease t.ts_fit dynamically due to new transactional operations.
The m-scheduler's main task is to ascertain t.ts_tol < t.ts_fit until t commits. At a violation
of this invariant it aborts either t or the respective conflicting transaction. The
method checkTimestamps() from Line 59 assists in updating t.ts_tol accordingly and in
checking the stated invariant after the update. It is used by the methods read(), write()
and commit().
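The invariant can be made concrete with a compilable variant of checkTimestamps(). The sketch below is a simplified illustration rather than the paper's code: the class names `Tx` and `Invariant`, the camel-case field names and the use of `Integer.MAX_VALUE` as a stand-in for ∞ are assumptions of this example.

```java
// Hypothetical, simplified transaction record; INF stands in for the paper's ∞.
final class Tx {
    static final int INF = Integer.MAX_VALUE;
    int ts = INF;    // commit timestamp, INF while the transaction is active
    int tsFit = INF; // fitting timestamp
    int tsTol = 0;   // largest timestamp of a committed source of a normal edge

    boolean isActive() { return ts == INF; }
}

final class Invariant {
    // Records a normal edge a -> b (a committed, b active) and reports whether
    // b now violates tsTol < tsFit, in which case the caller aborts a transaction.
    static boolean checkTimestamps(Tx a, Tx b) {
        if (!a.isActive() && b.isActive() && a.ts > b.tsTol) b.tsTol = a.ts;
        return b.isActive() && b.tsTol >= b.tsFit;
    }
}
```

As long as `b.tsFit` is still unbounded, any normal edge is tolerated; once a reverse edge has lowered `b.tsFit` to, say, 3, a normal edge from a transaction with timestamp 5 makes the check report a violation.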

The relation V associates data elements (instances of class DE) with cached method calls
(Line 15). The latter ones are identified by MId-objects according to the read operations
by which the method result was computed. (This coincides with the description of V from
Section 2.2 and Figure 4.) The purpose of the relations rt, wt and mt is to associate data
elements and IDs of method results, respectively, with the transactions in which they were
accessed (Lines 16 to 18).

The methods read(), write() and methodOp() first check whether the intended operation
might violate the quality "t-fitting". At a violation, they abort the current transaction.
(Note that the pseudo code abstracts from the details of the abort process.) Otherwise, they
update the m-scheduler's data structures.

As an example of how the violation check works, consider Line 21 of read(): Using
wt the method binds each transaction that wrote the same data element as the current read
operation to the local variable s. If the transaction (bound to) s has got a timestamp less
than ∞ it must have committed and so if the current transaction t committed too, the
read operation would result in a normal edge $s \to t \in MCSG$. So, in order to assert
"t-fitting" for t the expression s.ts < t.ts_fit must hold and this is just checked in Line 21
using checkTimestamps(). The arguments behind the checks of the method write() are
similar (Lines 26 to 29).

methodOp() observes a new m-operation $m_i^{k,l}$ of a transaction $T_i = t$ and determines if
the operation produces reverse or normal edges in respect to committed transactions. In
order to do so, methodOp() loops over all data elements which are referenced by the
m-operation $m_i^{k,l}$ (Line 34). If a committed transaction $T_j = s$ has written one of those data
elements, there is a conflict between $T_i$ and $T_j$. Further, if $T_j$'s timestamp is younger than
$ts(T_k)$, one obtains the situation $r_k^l[x] < c_k < w_j[x] < c_j < m_i^{k,l}$ which implies a reverse
edge and so t.ts_fit must potentially be updated (Line 37). Using the map txId2Ts the
m-scheduler fetches the timestamp $ts(T_k)$ in respect to $m_i^{k,l}$. Conversely, $ts(T_j) \le ts(T_k)$
only allows the two options $w_j[x] < r_k^l[x] < m_i^{k,l}$ and $r_k^l[x] < w_k[x] < c_k < m_i^{k,l}$ with $k = j$.
The former option indeed causes a normal edge and so, t.ts_tol must potentially be updated
(Line 39). The latter option is impossible since the base protocol causes the cached result
referenced by $m_i^{k,l}$ to be invalidated right after executing $w_k[x]$. Eventually, methodOp()
tests the above stated invariant for t due to the potential change of t.ts_fit and t.ts_tol (Line
40).
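The case distinction made by methodOp() can be sketched as a single step that is applied once per committed writer. This is a hedged illustration with hypothetical names (`MethodOpStep`, `onCommittedWriter`); it models only the timestamp bookkeeping and leaves out the relations V, wt and mt.

```java
// Hypothetical model of the per-writer step inside methodOp() for an active transaction t.
final class MethodOpStep {
    static final int INF = Integer.MAX_VALUE; // stand-in for ∞
    int tsFit = INF; // t's fitting timestamp
    int tsTol = 0;   // t's largest tolerated predecessor timestamp

    // sTs, sTsFit: timestamp and fitting timestamp of a committed writer s = T_j.
    // kTs: timestamp of T_k, whose reads produced the cached method result.
    // Returns true if t must be aborted because tsTol < tsFit no longer holds.
    boolean onCommittedWriter(int sTs, int sTsFit, int kTs) {
        if (sTs > kTs && sTsFit < tsFit) tsFit = sTsFit; // reverse edge t -> s
        if (sTs <= kTs && sTs > tsTol) tsTol = sTs;      // normal edge s -> t
        return tsTol >= tsFit;
    }
}
```

For the prefix of $H_1$ discussed in Section 5.1, the cached result stems from $T_1$ and the writer is $T_2$, so the step lowers the fitting timestamp of $T_3$ to 2 without aborting it. A later normal edge from $T_2$ to $T_3$ would then raise `tsTol` to 2 and trigger the abort.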

Finally consider the functioning of commit(): At first a timestamp is assigned to t.ts
(Line 44). Since commit() is synchronized, all committing transactions are totally ordered
and so are their timestamps. In concordance with Definition 16 t's fitting timestamp is set to
t.ts if it hasn't got a lower timestamp yet (Line 45).

Because t's timestamp is now known, all conflict edges between t and active transactions
can be checked to see whether they are reverse or normal edges and if they violate
"t-fitting". The Lines 46 to 49 determine all related reverse edges and update an active
transaction's fitting timestamp s.ts_fit accordingly. Note that a related conflict is guaranteed
to cause a reverse edge. To see this, let again be $T_i = t$ and $T_j = s$. A normal edge
would lead to the situation $w_i[x] < r_k^l[x] < m_j^{k,l} < c_i$ but this contradicts the assumption that
the resource manager guarantees strictness for rw-histories. Line 50 checks if s must be
aborted because of a change of s.ts_fit in Line 49.

The Lines 51 to 57 inspect active transactions s for normal edges $t \to s \in MCSG$ and
abort a respective transaction s if t-fitting is violated due to t. In analogy to the case from
the Lines 46 to 49, it can be shown that conflicts inspected by the Lines 55 to 57 always
lead to normal edges.

5.3 Memory Management 

So far the data structures used in Figure 6 would unboundedly grow with the number of 
transactions and operations that the system processes. The following paragraphs briefly 
describe how to limit the size of these data structures without changing the functioning of 
the discussed implementation. 

The first question to answer is when entries for a certain transaction may be deleted
because they don't affect the processing of active transactions anymore. A closer look at
Figure 6 leads to two different cases to be considered: Due to the Lines 21, 27, 29, 40, 52,
54, and 57 an (active) transaction $T_i$ may be aborted if some other transaction $T_j$ produces
a normal edge $T_j \to T_i$ such that $ts(T_j) \ge ts_{fit}(T_i)$ holds. For this case it suffices to retain
the entries for just those transactions contained in the following set:

$M_1 = \{\, t \mid t.ts \ge \min\{\, s.ts_{fit} \mid s \text{ is active} \,\}\,\}.$

The second case covers Line 37 where the fitting timestamp of a committed transaction
is assigned to the fitting timestamp of an active transaction. Therefore, one also needs to
retain the entries of transactions t contained in the following set:

$M_2 = \{\, t \mid t.ts \ge \min\{\, s.ts_{fit} \mid \exists (x,(k,l)) \in V : x \in s.wl \wedge txId2Ts(k) < s.ts \,\}\,\}.$

Finally, Line 49 also affects the fitting timestamp of active transactions but since it only
passes on the fitting timestamp of a transaction that is about to be committed, the respective
entry is already contained in $M_1$.
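How the set $M_1$ bounds the retained entries can be sketched as follows. The code is an illustrative assumption, not part of the paper: `RetentionSketch` and its methods are hypothetical, committed transactions are identified by their timestamps, the bound is taken to be inclusive, and `Integer.MAX_VALUE` stands in for ∞.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper that selects the committed transactions belonging to M_1.
final class RetentionSketch {
    static final int INF = Integer.MAX_VALUE; // stand-in for ∞

    // Smallest fitting timestamp among all active transactions.
    static int threshold(Collection<Integer> activeTsFits) {
        int min = INF;
        for (int f : activeTsFits) min = Math.min(min, f);
        return min;
    }

    // Timestamps of committed transactions whose entries must be retained:
    // those that are not below the threshold (assumed inclusive).
    static List<Integer> retained(Collection<Integer> committedTs,
                                  Collection<Integer> activeTsFits) {
        int bound = threshold(activeTsFits);
        List<Integer> keep = new ArrayList<>();
        for (int ts : committedTs) if (ts >= bound) keep.add(ts);
        return keep;
    }
}
```

Entries of committed transactions below the threshold can no longer cause an abort through a normal edge, which is why they may be dropped in this sketch.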

The joint set $M_1 \cup M_2$ forms the set of transactions whose entries need to be retained, but
how can its size be controlled? There are two ways to do this: Firstly, one can delete entries
(x, (k, l)) from V which are stale because some transaction $T_j$ with a younger timestamp
than $T_k$ has performed an operation $w_j[x]$. This reduces the size of $M_2$. Alternatively,
an active transaction can be aborted in order to reduce the size of $M_1$. Finding the right
candidates to be removed from $M_1 \cup M_2$ can be done efficiently. (A detailed discussion
of this process is beyond the scope of this paper.) Moreover, practical experience such as
from the experiments of the next section shows that the size of $M_1 \cup M_2$ is not a critical
system factor.

By controlling $|M_1 \cup M_2|$, one can limit the size of the data structures rt, wt, mt and
txId2Ts from Figure 6. Still, V may grow unboundedly because it must hold an entry
for every valid cached method result but there may be arbitrarily many of those results
(in arbitrarily many caches). To tackle this problem, V should be limited by a fixed (but
reasonably high) upper bound. Then, an LRU-strategy can be used to replace respective
entries in V. By extending the base protocol from Section 2.2 the client cache that stores
a method result which is associated with a replaced entry of V can be notified in order to
erase the result.
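A bounded, LRU-managed index of this kind can be sketched with `java.util.LinkedHashMap` in access order. The sketch makes several assumptions that are not taken from the paper: the class name `BoundedV`, keying the index by method-result ID instead of by data element, and collecting evicted IDs in a list so that the caller can notify the affected client caches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bounded index: method-result ID -> dependency information.
final class BoundedV<K, E> extends LinkedHashMap<K, E> {
    private final int capacity;
    private final List<K> evicted = new ArrayList<>(); // IDs whose caches must be notified

    BoundedV(int capacity) {
        super(16, 0.75f, true); // accessOrder = true yields LRU iteration order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, E> eldest) {
        boolean evict = size() > capacity;
        if (evict) evicted.add(eldest.getKey()); // remember it for cache notification
        return evict;
    }

    // Returns and clears the IDs evicted since the last call.
    List<K> drainEvicted() {
        List<K> out = new ArrayList<>(evicted);
        evicted.clear();
        return out;
    }
}
```

After each insertion the caller would drain the evicted IDs and ask the client caches holding the corresponding method results to erase them.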

A last thing to consider is that due to invalidation delays for cached method results,
methodOp() can potentially be called with an argument value m for which the respective
entry in V has already been replaced (or removed by controlling $M_2$). For this reason
methodOp() must be adjusted to check the validity of its argument value m. To do so, the
following code should be inserted after Line 33 of Figure 6:

if (V⁻¹(m) == ∅) { abort(t); return; }


6. EVALUATION 

In this section we briefly justify the intellectual investment in transactional method caching
by giving evidence that the approach can considerably improve system scalability and
performance.

6.1 Experiment 

We implemented a prototype of a transactional method cache and an m-scheduler on top 
of the EJB application server product JBOSS v3.2.3 [JBoss ]. The implementation of the 
cache’s base protocol follows the architecture from Section 2.2. The relational database 
management system MySQL v4.0.18 [MySQL ] serves as a resource manager. The client 
is a multithreaded Java program performing remote service method invocations. The client, 
the application server and the database system are hosted on three separate PCs in a local 
network, whereby the PCs’ hardware suits up-to-date desktop standards (including a 1.2 
GHz Pentium 4 Processor and 512 MB RAM). The PCs operate under Windows XP. By 
observing the related system resources we ensured that neither network bandwidth nor the 
load on the client machine represented a potential bottleneck for the experiment. 

The experiment’s database consists of a single SQL table with the following structure: 

item(id int primary key, name varchar(50), descr varchar(250), 
price float, weight float, manuf varchar(50)) 

Using an auxiliary program the table was filled with 1 million random valued entries. At
the application server, an EJB session bean implemented a service interface according to
Figure 7. The method findItemById() reads a database entry from the item-table via
JDBC [Sun b] and returns the contents of a related table row as an Item-object. The
related table row is queried via its key value using the method's id-argument. Similarly,
updateItem() changes a table row according to the Item-object which is passed in as
an argument. The related table row is accessed via its key value using the Item-object's
id-field. (If no such row exists, the method throws an exception.) For the database the
SQL isolation level was set to "SERIALIZABLE". On this level, MySQL performs a strict
(1-version) two-phase lock protocol with row level locking.

The m-scheduler is implemented as a delegating JDBC driver and incorporates the protocol
from Section 5.2. As explained at the end of Section 2.2, we had to insert extra code
behind the service methods' JDBC statements in order to inform the m-scheduler about the
accessed table rows. (In this respect, the corresponding id-value was chosen to identify a
data element.)

import java.rmi.RemoteException;

public interface ItemSession extends javax.ejb.EJBObject {
  public Item findItemById(int id) throws RemoteException;
  public void updateItem(Item item) throws RemoteException;
}

public class Item implements java.io.Serializable {
  public int id; public String name;
  public String description; public double price;
  public double weight; public String manufacturer;
}

Fig. 7. Java Pseudo Code of the Experiment's Service Interface

The client contains a single transactional method cache. The cache applies an LRU 
replacement strategy with a limit of 4000 storable method results. A variable number of 
client threads perform transactions concurrently. Every transaction consists of 10 method 
calls addressing the server’s EJB interface. 

For every call a client thread chooses randomly whether to call findItemById() or
updateItem(). findItemById() is invoked with the probability $p_r = 0.8$ whereas updateItem()
has the probability $1 - p_r$. After finishing the 10 calls successfully, the thread commits
(respectively aborts) its transaction with a chance of $p_c = 0.95$ (respectively $1 - p_c$).10 At last
the thread pauses for 1 second before starting a new transaction (no matter if the previous
transaction committed or aborted).

An important parameter that determines the experiment's cache hit rate as well as the
cache invalidation rate is the value of the id-argument when calling findItemById() and
the value of item.id when calling updateItem(). The client uses a random distribution
to compute a corresponding value, whereby 1 million item-table rows are potentially
referenced.

During a warmup phase the cache fills up to its maximum size of 4000 method results.
After that the probability that a service method call causes a hit is 53% (this chance
implies the event of invoking findItemById()). The probability is mainly caused by the
given cache size and the chosen random distribution for generating id-values which is not
uniform.11 The chance of invalidating a cached method result (due to a respective call of
updateItem()) is about 13.25% ($= (1 - p_r)/p_r \cdot 53\%$).
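The workload parameters can be summarized in a short Java sketch. It is a hedged illustration and not the experiment's client code: `WorkloadSketch` and its methods are hypothetical, the log-normal parameters are those stated in footnote 11, and clamping the sampled value to the key range 1 to 1,000,000 is an assumption of this example.

```java
import java.util.Random;

// Hypothetical model of the client's random choices.
final class WorkloadSketch {
    static final double P_READ = 0.8; // p_r, chance of calling findItemById()
    static final int ROWS = 1_000_000; // number of item-table rows

    // Draws a log-normally distributed id as exp(mu + sigma * Z) with standard normal Z.
    // Clamping to [1, ROWS] is an assumption made for this sketch.
    static int nextId(Random rnd, double mu, double sigma) {
        double v = Math.exp(mu + sigma * rnd.nextGaussian());
        return (int) Math.max(1, Math.min(ROWS, Math.round(v)));
    }

    // Decides whether the next call is a read (findItemById) or a write (updateItem).
    static boolean nextIsRead(Random rnd) {
        return rnd.nextDouble() < P_READ;
    }

    // Invalidation chance as computed in the text: (1 - p_r) / p_r * hitRate.
    static double invalidationRate(double pRead, double hitRate) {
        return (1 - pRead) / pRead * hitRate;
    }
}
```

Calling `invalidationRate(0.8, 0.53)` reproduces the 13.25% stated above, since $0.2 / 0.8 \cdot 0.53 = 0.1325$.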

One may ask why we did not resort to an existing benchmark application instead of
designing the experiment from above. Unfortunately there are no useful and realistic
benchmarks for testing client-side transactions in the application server domain. RUBiS [Cecchet
et al. 2002; Cecchet et al. 2001; ObjectWeb] is an EJB benchmark that comes close to our
needs and models an auction web site which is similar to eBay.com. However, the benchmark
does not account for client-side transactions and cannot be reasonably adjusted to
make use of this feature.


10 We have also tried other transaction lengths varying between 5 and 25 calls per transaction. The results are very
similar to the chosen value of 10 method calls per transaction.

11 Essentially we employed a log-normal distribution with the standard parameters $\mu = 7$ and $\sigma = 1.6$.

Fig. 8. Committed Transactions as a Function of the Number of Concurrent Client Threads (Throughput).
Horizontal axis: Number of Clients. Series: No Caching; Base Prot.; Base Prot., No Hits; OCC-Like Prot.; OCT Prot.


Still, the main input parameters that govern the experiment from above represent conservative
estimates of similar parameters that result from applying non-transactional method
caching to RUBiS. In particular, [Pfeifer and Jakschitsch 2003] observed cache hit rates
between 53% and 78% when applying non-transactional method caching to RUBiS.
[Cecchet et al. 2001] considers a fraction of about 85% of read-only method calls as most
representative for an auction web site workload. (In contrast, we are more conservative by
setting $p_r = 80\%$.)

We therefore believe that transactional method caching can cause similar results as for
the given experiment when it is applied to real world applications. Moreover, due to the
experiment's simplicity, its input parameters are clear and its results are well traceable.
Beyond these considerations, [Pfeifer and Jakschitsch 2003] has already shown that
non-transactional method caching produces very good efficiency improvements when applied
to RUBiS.

6.2 Results 

For the results presented next, every data point corresponds to a two minute measuring 
period. The measuring period was preceded by a two minute warmup phase in order to 
fill the method cache. By conducting additional test experiments we ensured that both 
the duration of the measuring phase as well as the warmup phase produced representative 
values. 

Figure 8 shows the number of committed transactions per minute for a varying number
of concurrent client threads under five different system configurations. The graph "No
Caching" represents the respective results for the system without using a method cache.
The graph "OCT Prot." depicts the results if transactional method caching is applied using
the m-scheduler protocol from Section 5 (OCTP). A simpler transactional protocol which
is similar to the classical OCC protocol from [Adya et al. 1995] has also been tested (see
also Section 5.1). The fourth graph displays system behavior when a method cache is
used while only applying the base protocol from Section 2.2.2. This option would hardly
be applied in practice since it does not provide transactional consistency. It was added
to Figure 8 because it gives an impression of the overhead of an m-scheduler protocol as
opposed to the pure base protocol. Similarly the graph "Base Prot., No Hits" shows system
behavior when applying the base protocol but not granting any cache hits. This graph helps
to characterize the overhead of the base protocol versus a system without method caching.

Fig. 9. Average Duration of a Transaction that Executed 10 Service Method Calls (Response Time).
Series: No Caching; Base Prot.; Base Prot., No Hits; OCC-Like Prot.; OCT Prot.

Fig. 10. Percentage of Aborted Transactions in Respect to Stalled Transactions.
Horizontal axis: Number of Clients. Series: No Caching; OCC-Like Prot.; OCT Prot.

All system variants scale well with an increasing number of concurrent client threads.
However, system variants using method caching attain a considerably higher level of
transaction throughput. By comparing "No Caching" and "Base Prot., No Hits" one can see
that the additional cost for the base protocol remains moderate. The m-scheduler protocols
reduce the transactional throughput in comparison to a "pure" base protocol, because they
abort a fraction of transactions for consistency reasons.

Fig. 11. Common Tiers of Web Application Architectures and Related Options for Caching.

Figure 9 illustrates the average duration of a successful transaction for the same runs 
as in Figure 8. Here, method caching considerably shortens transaction runtimes and so it 
improves system performance. As in Figure 8 one can observe the cost of the base protocol 
and the m-scheduler protocols which are both moderate. 

Finally, Figure 10 shows the transaction abortion rate for those runs from Figure 8 which
maintain transactional consistency. Obviously abortions become more likely with an
increasing number of concurrent transactions. The worst abortion rate is observed for the
OCC-like protocol - transactions may be aborted by the m-scheduler as well as the database
system. For the system variant without method caching only the database system aborts
transactions. Surprisingly a system with method caching using OCTP has lower abortion
rates than the variant without method caching! The reason for this is that OCTP allows
even (some) transactions to commit that have caused cache hits on stale cached method
results. As opposed to that, the OCC-like protocol always aborts transactions accessing
stale cached method results and so, OCTP has a better quality. In essence, OCTP
establishes a kind of a consistent multi-version transaction scheduling policy in respect to
cached method results.

All in all, the experiments give evidence that using OCTP, transactional method caching 
can improve system throughput, response time as well as transaction abortion rates. 

7. RELATED WORK 

7.1 Web Application Caching 

In recent years, research as well as industry have made various efforts to improve the
performance of web applications by means of caching. Since transactional method caching
can be beneficial in the context of web applications, we briefly compare it against other
caching approaches in this field and discuss the advantages and disadvantages.

Figure 11 shows the tiers of a typical web application architecture and highlights where 
caches potentially come into play: 

— Application data caching happens somewhere in between the database and the application
server tier. If it is done right in front of the database [Grembowicz 2000; Luo et al.
2002; Larson et al. 2003; The TimesTen Team 2000], abstractions of database queries
are associated with query results in the cache. In case of a cache hit, the query result
is immediately returned by the cache as opposed to running the database query engine.
At the server side, application data is cached either programmatically through runtime
objects whose structure has been designed by the application developer [Apache Group;
jcache] or it is controlled by an object-relational mapping framework [Oracle; Software
Tree].

— Web page caching usually occurs in front of a servlet- or script-enabled web server. 
Beyond the simple task of caching static pages, there are also many approaches for 
caching dynamically generated web pages [Anton et al. 2002; Challenger et al. 1999; Li
et al. 2002].

— A method cache is inserted at the “backend” of a servlet- or script-enabled web server
from where application server calls are initiated. While [Pfeifer and Jakschitsch 2003]
discussed non-transactional method caching, this paper is the first to present a solution
for transactional method caching.

The major problem of application data caches is that they can only save the cost of 
database queries but no cost originating at the application server tier. Therefore caching of 
service method results has a higher potential for improving system efficiency. In contrast, 
the pure cost for executing page generation scripts at the Web server tier is rather low and 
so, there is not much gain when caching dynamic Web pages instead of service method 
results. 

One important question that all dynamic web caching strategies must deal with is when 
and how to invalidate cache content. In [Candan et al. 2001; Li et al. 2002] and [Luo 
and Naughton 2001] URLs of dynamic pages on the web server side are associated with 
dependent SQL queries on the database level. If a database change affects a corresponding
query, the related pages in the cache are invalidated. In [Candan et al. 2001; Li et al.
2002] dependencies between queries and URLs are automatically detected through sniffing
along the communication paths of a web application’s tiers. Although the approach
observes database changes, it provides only a weak form of update consistency, whereas
our approach ensures full transactional consistency.
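The dependency-based invalidation described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the class `PageCache`, its method names and the simplification of tracking dependencies per table (rather than per SQL query) are assumptions made purely for illustration.

```python
from collections import defaultdict


class PageCache:
    """Toy dynamic-page cache invalidated through data dependencies.

    Assumption: dependencies are tracked per table name, which is a
    coarser granularity than the per-query tracking of the cited work.
    """

    def __init__(self):
        self.pages = {}                          # URL -> cached page body
        self.urls_by_table = defaultdict(set)    # table name -> dependent URLs

    def put(self, url, body, tables_read):
        """Cache a page and record which tables its queries read."""
        self.pages[url] = body
        for table in tables_read:
            self.urls_by_table[table].add(url)

    def get(self, url):
        """Return the cached page body, or None on a cache miss."""
        return self.pages.get(url)

    def on_table_update(self, table):
        """Invalidate every cached page that depends on the changed table."""
        stale = self.urls_by_table.pop(table, set())
        for url in stale:
            self.pages.pop(url, None)
        return stale


cache = PageCache()
cache.put("/orders", "<html>orders</html>", ["orders"])
cache.put("/summary", "<html>summary</html>", ["orders", "items"])
cache.put("/profile", "<html>profile</html>", ["users"])
invalidated = cache.on_table_update("orders")
```

Such a scheme removes stale pages after an update but makes no statement about what a concurrently running transaction has already read, which is why it only yields a weak form of update consistency.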

Other strategies for dynamic web page caching require a developer to provide explicit 
dependencies between URLs of pages to be cached and URLs of other pages that invalidate 
the cached ones [Persistence Software 2001]. Often, server-side page generation scripts or 
database systems may also invalidate a cached page by invoking invalidation functions of 
the web cache’s API [Anton et al. 2002; Spider Software 2001; XCache Technologies ]. 
Unfortunately, these strategies are invasive, which means that application code (e.g. page
generation scripts) has to be changed. In contrast, our approach is completely transparent
to the client code and requires only minor changes to the server-side code. Therefore it
can be applied even in late cycles of application development.

An explicit fragmentation of dynamic web pages via annotations in page generation 
scripts helps to separate static or less dynamic aspects of a page from parts that change 
more frequently [Datta et al. 2001; ESI ]. Also, dependencies such as described in the 
previous paragraph can then be applied to page fragments instead of entire pages. In this 
respect, our approach enables an even more fine-grained fragmentation as it treats dependencies
on a level where page scripts invoke service methods from the application server. A
great benefit is that explicit page fragmentation annotations (such as supported by [ESI])
then become obsolete. This also leads to the conclusion that caching the results of service
method calls causes cache hit rates which are at least as good as in the case of dynamic
Web caching (or even better).

7.2 Conventional Transaction Protocols 

This section highlights the differences between conventional transaction protocols and the 
approach described in this paper. 

Existing work in the field of transactional caching relates to page server systems, where 
a client can download a database page to its local cache, change it and eventually send 
those changes back to the server [Franklin et al. 1997]. For these systems the cache protocol
ensuring transactional consistency forms an integral part of the database system itself.
In contrast, this paper’s approach assumes that a tight integration with a given database
system is not possible. Moreover, the presented approach accounts for the characteristics
of an application server that does not enable direct access to data elements such as
pages. Therefore, we described how to extend an application server architecture to enable
consistent client-side method caching. The cache protocol is designed so that it does not
alter the standard communication flow between client and server. Also, the unit for ensuring
transactional consistency, the m-scheduler, remains separate from an underlying
resource manager (such as a database system).

In order to develop an efficient protocol for the m-scheduler we presented a theory for 
reflecting the use of cached method results inside transactions. Without this theory, proving 
the correctness of OCTP would have been very difficult. As opposed to that, the correctness 
of conventional transactional cache protocols such as OCC [Adya et al. 1995] or CBR 
[Franklin et al. 1997] is more obvious and does not demand formal considerations. 

An important difference between OCTP and conventional transactional cache protocols
is that OCTP neither avoids access to stale cache entries (as CBR does) nor
necessarily aborts transactions which have accessed stale cache entries (as OCC does).
Therefore, in spite of being optimistic, OCTP can offer low transaction abortion rates.

With respect to the taxonomy of [Franklin et al. 1997] OCTP is a “detection based protocol”
whereby a validation may be “deferred until commit”. Further, OCTP gives invalidation
hints “during a transaction” and uses “invalidation” (as opposed to “propagation”)
as its “remote update action”. Propagation as a remote update action is not applicable since
the m-scheduler has no access to a method call’s arguments which are needed for recomputing
the method result that would have to be propagated. According to the taxonomy of
[Gruber 1997] OCTP supports “early aborts” and may be classified as “lazy reactive”.

Apart from transactional cache protocols, OCTP has a similarity to the multiversion
timestamp protocol (MVTO) from [Reed 1983]. Let $T_i$ be a transaction with an operation
$r_i[x]$ but without a prior $w_i[x]$. At MVTO, $r_i[x]$ reads the version $x_k$ that was written by a
committed transaction $T_k$ such that
$ts(T_k) = \max\{ts(T_j) \mid ts(T_j) < ts(T_i) \wedge w_j[x] \in T_j\}$ holds.

Scheduling an operation $m_i^{k,l}$ at OCTP is similar to scheduling $r_i[x]$ at MVTO. However,
at OCTP the version of a respective data element is already fixed by the cache hit itself,
namely by $m_i^{k,l}$. Therefore, the m-scheduler cannot choose $x_k$ but can only determine where
$T_i$ would best “fit” in the given timestamp order. In order to do so the m-scheduler computes
$ts_{fit}(T_i)$.
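For reference, the MVTO read rule quoted above can be sketched in a few lines. This is only an illustration of the textbook version selection step, not of OCTP or of the fitting timestamp; the function name `mvto_read` and the representation of versions as `(writer_timestamp, value)` pairs are assumptions.

```python
def mvto_read(versions, ts_reader):
    """Select the version a reader with timestamp ts_reader sees under MVTO.

    versions: list of (writer_ts, value) pairs, one per committed
    transaction that wrote the data element x (hypothetical layout).
    Returns the pair with the largest writer_ts that is still smaller
    than ts_reader, or None if no such version exists.
    """
    candidates = [v for v in versions if v[0] < ts_reader]
    if not candidates:
        return None  # no committed version precedes the reader
    return max(candidates, key=lambda v: v[0])


# Three committed writers of x with timestamps 1, 5 and 9.
versions_of_x = [(1, "a"), (5, "b"), (9, "c")]
seen_by_7 = mvto_read(versions_of_x, 7)
```

Here the scheduler is free to pick the version. With a cache hit the version is already given, so the remaining question is the reverse one: at which position in the timestamp order the reading transaction is consistent with that version.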

The fitting timestamp $ts_{fit}$ from Definition 16 is also connected to the concept of dynamic
timestamps from [Bayer et al. 1982]. In [Bayer et al. 1982] a scheduler may delay
the assignment of timestamps to transactions in order to accept a broader range of serializable
histories. A respective timestamp is therefore called dynamic. Although OCTP’s
fitting timestamp may change dynamically, a related transaction’s real timestamp $ts(T_i)$ is
always dictated by the rw-scheduler and therefore it is not dynamic. This is the crucial
difference between OCTP and the proposition from [Bayer et al. 1982].

8. CONCLUSION 

This paper has presented an approach for the transactional caching of method results in 
the context of application server systems. A related cache is placed at the system’s client 
side. It comes into play when the client performs a sequence of method calls addressing 
the server, whereby the calls are demarcated by an ACID transaction. If the client invokes
a read-only method with the same arguments for the second time, the related result can potentially
be taken from the cache, which avoids an execution at the server. To achieve a reasonable
hit rate, the approach is inter-transactional, meaning that a cached method result can be
used by multiple client transactions.
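The basic caching idea summarized here can be sketched as follows. The sketch deliberately omits everything that makes the approach transactional (the m-scheduler, invalidation hints and validation at commit); the names `MethodCache`, `DemoServer` and `get_price` are hypothetical, and method arguments are assumed to be hashable.

```python
class MethodCache:
    """Client-side cache for results of read-only server methods.

    Non-transactional sketch: consistency handling is left out entirely.
    """

    def __init__(self, server):
        self.server = server
        self.results = {}   # (method name, argument tuple) -> cached result
        self.hits = 0

    def call(self, method_name, *args):
        """Return a cached result if present, otherwise ask the server."""
        key = (method_name, args)
        if key in self.results:
            self.hits += 1
            return self.results[key]
        result = getattr(self.server, method_name)(*args)
        self.results[key] = result
        return result

    def invalidate(self, method_name, *args):
        """Drop one cached result, e.g. after an invalidation notice."""
        self.results.pop((method_name, args), None)


class DemoServer:
    """Stand-in for an application server interface (hypothetical)."""

    def __init__(self):
        self.calls = 0

    def get_price(self, item_id):
        self.calls += 1
        return item_id * 10


server = DemoServer()
cache = MethodCache(server)
first = cache.call("get_price", 3)    # miss: executed at the server
second = cache.call("get_price", 3)   # hit: served from the cache
```

Because the cache object outlives a single transaction, the second call may belong to a different client transaction than the first, which is what the inter-transactional property refers to.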

The paper has adjusted the conventional architecture of an application server in order to 
enable transactional method caching. Since the use of cached method results alters the way 
a transaction is processed, it must be taken into account when ensuring transactional consistency.
Therefore, we introduced a new system component at the server side which maintains
transactional consistency in the presence of cache hits. This so-called m-scheduler observes
cache hit operations as well as normal data access operations and ensures serializability of
client transactions.

To develop a protocol for an m-scheduler, the paper extended the conventional 1-version
transaction theory by an operation which reflects the use of cached method results. We
derived a definition of serializability with respect to the extended transaction histories and
proved a corresponding serializability theorem.

Using these theoretical results, we developed an efficient recovery protocol as well as
an efficient serializability protocol for an m-scheduler and proved their correctness. Moreover,
the paper discussed some of the protocols’ implementation aspects. An experimental
evaluation showed that the presented cache can considerably improve system performance
and scalability as well as transaction abortion rates.

A limitation of the approach is that in order to guarantee transactional consistency, the
m-scheduler needs to observe all data access operations addressing an underlying resource
manager. Also, it has to make some basic assumptions about the resource managers’
transaction management protocols. The clear advantage of having a separate scheduler for
m-operations is that an integration of the presented cache only requires the modification
of the application server and its clients but not the resource manager(s). This is especially
important for practical considerations since many modern application server systems are
open and easily extensible, but most database management systems are not.

Note that the stated limitation would not be critical if an m-scheduler were integrated into a
resource manager. Still, the major contributions of this paper, namely the presented theory,
the recovery protocol and the transactional cache protocol also apply to this case.

As part of our future work we would like to apply the idea behind OCTP to the domain of 
page server systems. In this field many transactional cache protocols have been studied (for 
an up-to-date comparison see [Wu et al. 2004]). However, as explained in Section 7.2, none
of them allows transactions to commit that have accessed stale cache entries. Currently,
transaction protocols for page servers either enable moderate efficiency combined with
low abortion rates (e.g. CBR) or high efficiency combined with potentially intolerable 
abortion rates (e.g. OCC). In contrast, an OCTP-like protocol for page servers could bring 
together high efficiency (via optimism) and low abortion rates (by tolerating access to stale 
cache entries) while still ensuring serializability. 